Dev Tips & Tricks

Building a language interpreter in JavaScript - Part 3

Back to posts

Welcome to part 3 of building an interpreter in pure JavaScript. This is the final part of a journey to build an interpreter in pure JavaScript.

In the previous part, we built a parser for our tokens for the programming language we are making - printly. We defined grammar for our language and a way to structure the statements which we can use.

In this part, we will use the statement structure we designed and interpret it to make our language perform some actual work.

Yep, this part is all about building an interpreter.

Cleanup

Before we start, let's do some cleanup on our index.js code. Let's move the parser part into a separate file.

This is how our index.js should look like:

// Code which we want to parse
const code = `i = 5;`;

// Import the lexer
const analyseCode = require('./lexer-analyser');

// Run the lexer
const tokens = analyseCode(code);

// Import the parser
const parseTokens = require('./parser-analyser');

// Run the parser
const statements = parseTokens(tokens);

// Finally we output the statements we parsed.
console.dir(statements, { depth: null });

And we will move the parsing code into a separate file parser-analyser.js:

// We include grammar here.
// Grammar exports the very first rule: LineStatement
// That means that parseGrammar is actually the same as the LineStatement constant.
const parseGrammar = require('./grammar');
const TokenReader = require('./token-reader');

const parseTokens = tokens => {
    // Create a reader for our tokens.
    const reader = new TokenReader(tokens);

    const statements = [];

    while (reader.hasNext()) {
        // We parse grammar until we have the next token.
        const statement = parseGrammar(reader);

        if (statement) {
            // Our statement was parsed successfully,
            // so we add it to the list of statements.
            statements.push(statement);
            continue;
        }

        // Something went wrong here, we couldn't parse our statement here
        // so our language needs to throw a syntax error.
        let token = reader.hasNext() ? reader.get() : reader.getLastToken();
        throw new Error(`Syntax error on ${token.line}:${token.character} for "${token.value}". Expected an assignment, function call or an if statement.`);
    }

    // Return the parsed statements
    return statements;
};

module.exports = parseTokens;

This will allow us to work easily on the interpreter.

You can follow this part in the code by checking out part3-chapter1 in the repository.

Building an interpreter

Interpreted languages like Python, PHP, Ruby, and JavaScript also have a few more steps which include optimizations and even compilation into an intermediary language or bytecode which is easy for the interpreter to run, we are keeping things simple so we will just run the actions directly and return the result.

We will interpret each of the statements and return the result from the final statement.

To be able to do that, we will make a function:

// Take an array of statements and interpret
// each of the statements and use the passed vm (we will explain this below)
const interpretStatements = (statements, vm) => {
    let result = null; // Set initial result for null

    for(const statement of statements) {
        // Interpret the statement and set the result.
        result = interpretStatement(statement, vm);
    }

    // Return last set result.
    return result;
};

// We define a dummy function for now
const interpretStatement = (statement, vm) => null;

In the function, you see that we used an additional variable vm. This is a variable that will hold the state of our program after statements work on it. In other words, it will simulate the purpose of a "virtual machine" or vm.

This variable will hold all necessary data for our program to work correctly. So, things like functions, our program can call and all variables our program set or changed.

Here is how we will structure the vm variable:

const vm = {
    variables: {},
    functions: {}
};

In more complicated languages where we define functions and a lot of nesting and switching, we would also need to define a stack and push the previous state of variables in it and other data and pop it when the function returns. Our language doesn't have a way to define functions so we do not need it.

Now that we have a vm defined, let's start with statement interpretation. We will use statement type to determine how we will interpret the statement itself.

We'll start with the assignment statement.

// Interpret the statement itself with the passed vm
const interpretStatement = (statement, vm) => {
    switch(statement.type) { // We first check the statement type and run the corresponding function.
        case 'assignment': 
            return interpretAssignment(statement, vm);
        case 'function':
            return interpretFunctionCall(statement, vm);
        case 'if':
            return interpretIf(statement, vm);
    }

    // We didn't get the correct type here of the statement so
    // that means something is wrong and we need to throw an error.
    throw new Error(`Invalid statement type: ${statement.type}`);
};

// We define other functions as dummy functions for now.
const interpretAssignment = (statement, vm) => 'To be implemented';
const interpretFunctionCall = (statement, vm) => 'To be implemented';
const interpretIf = (statement, vm) => 'To be implemented';

Interpreting Assignment statement

The assignment statement has the following format for code a = 5;:

{
    type: 'assignment',
    name: 'a',
    expression: {
        type: 'unary',
        withNot: false,
        value: { type: 'number', value: '5' }
    }
}

Therefore, we only need to set the variable requested to the vm.

// Assignment just sets a variable as a result of the expression.
const interpretAssignment = (statement, vm) => {
    vm.variables[statement.name] = interpretExpression(statement.expression, vm);

    // We also return the set variable as a statement result.
    // This is not strictly necessary, but we want
    // every statement to return something.
    return vm.variables[statement.name];
};


As a part of this, we also need to interpret an expression. We will implement that too. Expressions can be infinitely deep which is bad for recursion but in our case, we should not reach the recursion limit unless we have a super complicated expression with a lot of nesting, which nobody would normally write.

const interpretExpresion = (expression, vm) => {
    // Similar to statements, we interpret expressions
    // based on the type.
    switch (expression.type) { 
        case 'string':
            // For strings we just return them because
            // they are already ready.
            return expression.value;
        case 'number':
            // For numbers we convert them to actual numbers
            // so that they can be used further in expressions.
            return parseFloat(expression.value);
        case 'variable': // For variables and the rest we will delegate them to their functions
            return interpretVariableRetrieval(expression, vm);
        case 'function':
            return interpretFunctionCall(expression, vm);
        case 'unary':
            return interpretUnaryOperation(expression, vm);
        case 'operation':
            return interpretBinaryOperation(expression, vm);
    }

    // We got the wrong type here so we throw an error
    // because something went wrong.
    throw new Error(`Invalid type: ${expression.type}`);
};

// Dummy functions for now
const interpretVariableRetrieval = (expression, vm) => 'Not implemented';
const interpretFunctionCall = (expression, vm) => 'Not implemented';
const interpretUnaryOperation = (expression, vm) => 'Not implemented';
const interpretBinaryOperation = (expression, vm) => 'Not implemented';

We directly implemented interpreting the simplest expressions for strings and numbers, but other expressions need some more work.  Accordingly, we need to implement a function for each of them:

Variable

{ type: 'variable', name: 'variable_name' }

Function:

const interpretVariableRetrieval = (expression, vm) => {
    if (!(expression.name in vm.variables)) {
        // Variable is not defined so this is a runtime error that we need to throw.
        throw new Error(`Runtime Error. Variable '${expression.name}' does not exist.`);
    }

    // We return the variable's value from VM.
    return vm.variables[expression.name];
};

Function

{
  type: 'function',
  name: 'function_name',
  parameters: [ ...expressions ]
}

Function:

const interpretFunctionCall = (expression, vm) => {
    if (!(expression.name in vm.functions)) {
        // Function is not defined so this is a runtime error so we throw it.
        throw new Error(`Runtime Error. Function '${expression.name}' is not defined.`);
    }

    // We need to interpret an expression for each of the function parameters
    // so we perform it using a map call.
    const parameters = expression.parameters.map(
        parameter => interpretExpression(parameter, vm)
    );

    // And we call a function from VM an pass parameters by a spread operator.
    return vm.functions[expression.name](...parameters);
};

Operation

{
    type: 'operation',
    operation: '+',
    left: {
        ...expression
    },
    right: {
       ...expression
    }
}

Function:

const interpretBinaryOperation = (expression, vm) => {
    // We interpret the expression for the left side of the operation
    const leftValue = interpretExpression(expression.left, vm);
    // And now for the right side
    const rightValue = interpretExpression(expression.right, vm);

    // And based on the operation we perform the same operation on the
    // left and right side to get the result we need.
    switch(expression.operation) { 
        case '+':
            return leftValue + rightValue;
        case '-':
            return leftValue - rightValue;
        case '*':
            return leftValue * rightValue;
        case '/':
            return leftValue / rightValue;
        case '&&':
            return leftValue && rightValue;
        case '||':
            return leftValue || rightValue;
        case '>':
            return leftValue > rightValue;
        case '<':
            return leftValue < rightValue;
        case '<=':
            return leftValue <= rightValue;
        case '>=':
            return leftValue >= rightValue;
        case '==':
            return leftValue == rightValue;
        case '!=':
            return leftValue != rightValue;
    }

    // We didn't get a operation we expect so we throw an exception here.
    throw new Error(`Invalid operation requested: ${expression.operation}`);
};

Unary

{
    type: 'unary',
    withNot: false,
    value: { ...expression }
}

Function:

const interpretUnaryOperation = (expression, vm) => {
    // We interpret the value we need
    const value = interpretExpression(expression.value, vm);

    // If the value is with ! we also apply that before passing it
    // back.
    return expression.withNot ? !value : value;
};

Implementing If statement

If the statement is pretty easy, once we have other things in place, because it is just a conditional interpretation of its own statements.

Here's how the if statement looks like:

{
    type: 'if',
    check: {
        ...expression
    },
    statements: [ ...statements ]
}

And the implementation is:

const interpretIf = (statement, vm) => {
    // Interpret the check expression we are checking for
    const checkValue = interpretExpression(statement.check, vm);

    if (checkValue) {
        // Value is true so we interpret the if's own statements
        // and return the value.
        return interpretStatements(statement.statements, vm);
    }

    // If check failed so we just return null.
    return null;
};

Implementing function statement

We have this part already. When did we implement it? Back when we implemented the expression we also implemented the interpretFunctionCall.

Our statement format is the same:

{ 
    type: 'function', 
    name: 'function_name', 
    parameters: [ ...expressions ] 
}

So, we just reuse the interpretFunctionCall, and we are done.

Putting it all together

Finally, we just need to do some slight modifications to our index.js to complete everything:

// Code which we want to parse
const code = `
world = 'World';
print('Hello ' + world);
`;

// Import the lexer
const analyseCode = require('./lexer-analyser');

// Run the lexer
const tokens = analyseCode(code);

// Import the parser
const parseTokens = require('./parser-analyser');

// Run the parser
const statements = parseTokens(tokens);

// Import the interpreter
const interpret = require('./interpreter');

// We create a virtual state machine object
const vm = {
    variables: {},
    functions: {
        print(message) { // We add a print function so that we can call a function from our code.
            console.log('MESSAGE:', message); 
        },
        pow(x, y) { // We also add a function which returns something for an expression.
            return Math.pow(x, y);
        }
    }
};

// Interpret the statements and return last result.
const result = interpret(statements, vm);

// And finally we output the result
console.log('Code we ran:');
console.log(code);
console.log('Result:')
console.dir(result, {depth: null});
console.log('Final VM State:', vm);

You can get to this part of the code by checking out the branch part3-chapter2.

And as an output we get:

MESSAGE: Hello World
Code we ran:

world = 'World';
print('Hello ' + world);

Result:
undefined
Final VM State: {
  variables: { world: 'World' },
  functions: { print: [Function: print], pow: [Function: pow] }
}

Our result is undefined because print function returns nothing. If we try this code:

pow(2, 5);

We get the following output:

Code we ran:

pow(2, 5);

Result:
32
Final VM State: {
  variables: {},
  functions: { print: [Function: print], pow: [Function: pow] }
}

And our interpreter is finished! We kept things separated and modular so that you can hopefully use this to build your own language or even extend printly.

And one last thing; we mentioned back in part 1 that we will also reimplement this in an easier way. We will leave that for the bonus part next time.

Hope you had a great learning experience and hope that you understand how languages are parsed way better by this point.

Share with
Terms of Use | Privacy Policy