Pulsar

Bali uses an interpreter called Pulsar. It is a fairly fast interpreter built on an unorthodox mix of a value space (internally called the "stack", as in a stack of values) and special-purpose registers that handle tasks such as passing arguments to functions and retrieving their return values.
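
To make the split between the value space and the registers concrete, here is a minimal sketch in Python. Everything in it (the ToyVM class, its field names, and its methods) is hypothetical and chosen for illustration; it is not Pulsar's actual layout or API, only one way such a design could look.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class ToyVM:
    """Hypothetical model for illustration, not Pulsar's real layout."""

    # The value space (the "stack"): values addressed by position.
    slots: Dict[int, Any] = field(default_factory=dict)
    # Special-purpose register: arguments queued for the next call.
    call_args: List[Any] = field(default_factory=list)
    # Special-purpose register: result of the most recent call.
    return_value: Any = None

    def load(self, position: int, value: Any) -> None:
        """Store a value at a position in the value space."""
        self.slots[position] = value

    def pass_argument(self, position: int) -> None:
        """Copy a value from the value space into the argument register."""
        self.call_args.append(self.slots[position])

    def call(self, function: Callable[..., Any]) -> None:
        """Call a function with the queued arguments, then clear them."""
        self.return_value = function(*self.call_args)
        self.call_args.clear()

    def read_return(self, position: int) -> None:
        """Copy the return register back into the value space."""
        self.slots[position] = self.return_value


vm = ToyVM()
vm.load(0, "Hello, ")
vm.load(1, "world!")
vm.pass_argument(0)
vm.pass_argument(1)
vm.call(lambda left, right: left + right)
vm.read_return(2)
print(vm.slots[2])
```

In this sketch, functions never read the value space directly: arguments travel through one register and results come back through another.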

Design

Bali, like most JavaScript engines, starts off by lowering the abstract syntax tree (AST) into a bytecode format, which it calls MIR.

Prior to 0.7.2, Bali would perform an incredibly wasteful pass: it generated the bytecode structures, emitted them as a string, and then parsed that string back into the VM's bytecode structures. Since 0.7.2, Bali converts the bytecode structures directly into the VM's structures, saving a lot of unnecessary memory allocations.
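
The difference between the two approaches can be sketched as follows. This is a simplified Python model under stated assumptions: the CodegenOp and VmOp types, the one-line-per-operation text format, and all function names are hypothetical and do not mirror Bali's real structures or its real textual bytecode.

```python
import json
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class CodegenOp:
    """Hypothetical operation as produced by the code generator."""
    name: str
    args: Tuple


@dataclass(frozen=True)
class VmOp:
    """Hypothetical operation as consumed by the VM."""
    opcode: str
    operands: Tuple


def emit_text(ops: List[CodegenOp]) -> str:
    """Old approach, step 1: serialize every operation into one big string."""
    return "\n".join(f"{op.name} {json.dumps(list(op.args))}" for op in ops)


def parse_text(text: str) -> List[VmOp]:
    """Old approach, step 2: parse that string back into VM structures."""
    result = []
    for line in text.splitlines():
        name, raw_args = line.split(" ", 1)
        result.append(VmOp(name, tuple(json.loads(raw_args))))
    return result


def lower_directly(ops: List[CodegenOp]) -> List[VmOp]:
    """New approach: convert structure to structure, with no text in between."""
    return [VmOp(op.name, op.args) for op in ops]


ops = [CodegenOp("LDS", (17, "Hello, world!")), CodegenOp("PARG", (17,))]
via_text = parse_text(emit_text(ops))
direct = lower_directly(ops)
```

Both paths end with the same list of VM operations, but the first one allocates an intermediate string and re-parses every operand, which is the overhead the direct conversion avoids.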

The bytecode format started its life as part of the Mirage project. However, ever since the VM was moved from Mirage into Bali's source tree, the two formats have diverged radically and are no longer compatible with one another, despite looking fairly similar. Bali still carries a lot of legacy baggage from this, as Mirage was supposed to be as agnostic as possible, while Bali's VM is strictly focused on JavaScript. This will become more obvious soon.

After 0.7.5, Pulsar uses a dispatch table instead of the massive switch-case statement it used earlier.
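
For readers unfamiliar with the two dispatch styles, the following Python sketch contrasts them. The opcodes LOAD and ADD, the handler functions, and the program format are all made up for this example; they are not Pulsar's real opcodes or handlers.

```python
from typing import Any, Callable, Dict, List, Tuple

Instruction = Tuple[str, Tuple[Any, ...]]
State = Dict[int, Any]


def op_load(state: State, position: int, value: Any) -> None:
    state[position] = value


def op_add(state: State, dest: int, left: int, right: int) -> None:
    state[dest] = state[left] + state[right]


def run_with_branches(program: List[Instruction]) -> State:
    """Switch-style dispatch: one long chain of opcode comparisons."""
    state: State = {}
    for opcode, operands in program:
        if opcode == "LOAD":
            op_load(state, *operands)
        elif opcode == "ADD":
            op_add(state, *operands)
        else:
            raise ValueError(f"unknown opcode: {opcode}")
    return state


# Table-style dispatch: each opcode maps straight to its handler.
DISPATCH: Dict[str, Callable[..., None]] = {
    "LOAD": op_load,
    "ADD": op_add,
}


def run_with_table(program: List[Instruction]) -> State:
    state: State = {}
    for opcode, operands in program:
        handler = DISPATCH.get(opcode)
        if handler is None:
            raise ValueError(f"unknown opcode: {opcode}")
        handler(state, *operands)
    return state


program: List[Instruction] = [
    ("LOAD", (0, 2)),
    ("LOAD", (1, 40)),
    ("ADD", (2, 0, 1)),
]
```

With a table, adding an opcode means registering one more handler, and the interpreter loop itself never grows.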

From codegen to execution

Let's take a very simple JavaScript program:

console.log("Hello, world!")

If we use Balde with the --dump-bytecode flag, we can check out what the lowering mechanism generates for this code:

# Bytecode generated by Bali
# Bali is a JavaScript engine under the Ferus project.
# For more information, visit https://github.com/ferus-web/bali
# Developed by the Ferus Authors for the Ferus Project

# Clause/CodeModule "String"
# Operations: 1
CLAUSE String
        1 CALL BALI_STRING
END String

# Clause/CodeModule "atob"
# Operations: 1
CLAUSE atob
        1 CALL BALI_ATOB
END atob

# Clause/CodeModule "btoa"
# Operations: 1
CLAUSE btoa
        1 CALL BALI_BTOA
END btoa

# Clause/CodeModule "encodeURI"
# Operations: 1
CLAUSE encodeURI
        1 CALL BALI_ENCODEURI
END encodeURI

# Clause/CodeModule "BigInt"
# Operations: 1
CLAUSE BigInt
        1 CALL BALI_BIGINT
END BigInt

# Clause/CodeModule "parseInt"
# Operations: 1
CLAUSE parseInt
        1 CALL BALI_PARSEINT
END parseInt

# Clause/CodeModule "outer"
# Operations: 36
CLAUSE outer
        1 LDUD 0
        2 LDF 1 nan
        3 LDF 5 inf
        4 LDB 2 true
        5 LDB 3 false
        6 LDN 4
        7 CFLD 7 0 "@bali_object_type"
        8 CFLD 8 0 "@bali_object_type"
        9 CFLD 9 0 "@bali_object_type"
        10 CFLD 10 0 "@bali_object_type"
        11 CFLD 11 0 "@bali_object_type"
        12 CFLD 12 0 "@bali_object_type"
        13 CFLD 13 0 "@bali_object_type"
        14 CFLD 14 0 "@bali_object_type"
        15 CFLD 15 0 "@bali_object_type"
        16 CFLD 16 0 "@bali_object_type"
        17 LDS 17 "Hello, world!"
        18 PARG 17
        19 CALL BALI_CONSTRUCTOR_STRING
        20 RARG
        21 RREG 17 0
        22 ZRETV
        23 RARG
        24 LDN 18
        25 LDUI 20 8
        26 PARG 20
        27 LDUI 21 18
        28 PARG 21
        29 LDS 21 "log"
        30 PARG 21
        31 CALL BALI_RESOLVEFIELD
        32 RARG
        33 PARG 17
        34 INVK 18
        35 RARG
        36 ZRETV
END outer

That's quite a mouthful, isn't it? Let us ignore every clause apart from outer; the others exist only so that some of the runtime's features can function properly. Note that this dump was produced in debug mode, where the emitter generates additional comments. Otherwise, the printed bytecode would be much more condensed.

Bootstrapping

Instruction 1 loads undefined at position 0. Every identifier lookup that fails during codegen points to this position. Instruction 2 loads NaN at position 1, and so on.

This is the part of the bytecode we call "bootstrapping". It essentially prepares the VM for what is to come.
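
The first six instructions of outer can be modeled with a short Python sketch. The readings of LDUD and LDF follow the description above; treating LDB as a boolean load and LDN as a null load is an assumption based on their operands in the dump. The bootstrap and resolve_identifier functions and the UNDEFINED sentinel are hypothetical helpers, not part of Bali.

```python
import math
from typing import Any, Dict, List, Tuple

# Hypothetical stand-in for JavaScript's undefined.
UNDEFINED = object()

# The first six instructions of the "outer" clause from the dump.
BOOTSTRAP: List[Tuple] = [
    ("LDUD", 0),
    ("LDF", 1, "nan"),
    ("LDF", 5, "inf"),
    ("LDB", 2, "true"),
    ("LDB", 3, "false"),
    ("LDN", 4),
]


def bootstrap(program: List[Tuple]) -> Dict[int, Any]:
    """Fill the value space with the well-known constants."""
    slots: Dict[int, Any] = {}
    for opcode, position, *rest in program:
        if opcode == "LDUD":
            slots[position] = UNDEFINED
        elif opcode == "LDF":
            slots[position] = float(rest[0])
        elif opcode == "LDB":  # assumed: boolean load
            slots[position] = rest[0] == "true"
        elif opcode == "LDN":  # assumed: null load
            slots[position] = None
        else:
            raise ValueError(f"unexpected opcode: {opcode}")
    return slots


def resolve_identifier(name: str, known: Dict[str, int]) -> int:
    """Hypothetical codegen lookup: unknown names fall back to position 0."""
    return known.get(name, 0)


slots = bootstrap(BOOTSTRAP)
missing = resolve_identifier("doesNotExist", {"NaN": 1})
```

Because position 0 always holds undefined after bootstrapping, a failed lookup in this model still yields a valid position to read from.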

It also creates a field called @bali_object_type in many of the objects that are quietly created during the native initialization phase. This is an internal tag used by the engine.