> function f(){f()}f()
Uncaught exception: [RuntimeError]: Call stack size limit exceeded
-> f
1234 more calls
-> (global execution context)
> function a(x){if(x>0){a(x-1)}else{throw Error()}}function b(x){if(x>0){b(x-1)}else{a(5)}}function c(){b(2)}c()
Uncaught exception: [Error]
-> a
5 more calls
-> b
-> b
-> b
-> c
-> (global execution context)
So far we have three different syntax highlighters for LibJS:
- js's Line::Editor stylization
- JS::MarkupGenerator
- GUI::JSSyntaxHighlighter
This not only caused repetition of most token types in each highlighter
but also a lot of inconsistency regarding the styling of certain tokens:
- JSSyntaxHighlighter was considering TokenType::Period to be an
operator whereas MarkupGenerator categorized it as punctuation.
- MarkupGenerator was considering TokenType::{Break,Case,Continue,
Default,Switch,With} control keywords whereas JSSyntaxHighlighter just
disregarded them
- MarkupGenerator considered some future reserved keywords invalid and
others not. JSSyntaxHighlighter and js disregarded most
Adding a new token type meant adding it to ENUMERATE_JS_TOKENS as well
as each individual highlighter's switch/case construct.
I added a TokenCategory enum, and each TokenType is now associated to a
certain category, which the syntax highlighters then can use for styling
rather than operating on the token type directly. This also makes
changing a token's category everywhere easier, should we need to do that
(e.g. I decided to make TokenType::{Period,QuestionMarkPeriod}
TokenCategory::Operator for now, but we might want to change them to
Punctuation.
Each JS global object has its own "console", so it makes more sense to
store it in GlobalObject.
We'll need some smartness later to bundle up console messages from all
the different frames that make up a page later, but this works for now.
More work on decoupling the general runtime from Interpreter. The goal
is becoming clearer. Interpreter should be one possible way to execute
code inside a VM. In the future we might have other ways :^)
This patch moves the exception state, call stack and scope stack from
Interpreter to VM. I'm doing this to help myself discover what the
split between Interpreter and VM should be, by shuffling things around
and seeing what falls where.
With these changes, we no longer have a persistent lexical environment
for the current global object on the Interpreter's call stack. Instead,
we push/pop that environment on Interpreter::run() enter/exit.
Since it should only be used to find the global "this", and not for
variable storage (that goes directly into the global object instead!),
I had to insert some short-circuiting when walking the environment
parent chain during variable lookup.
Note that this is a "stepping stone" commit, not a final design.
Taking a big step towards a world of multiple global object, this patch
adds a new JS::VM object that houses the JS::Heap.
This means that the Heap moves out of Interpreter, and the same Heap
can now be used by multiple Interpreters, and can also outlive them.
The VM keeps a stack of Interpreter pointers. We push/pop on this
stack when entering/exiting execution with a given Interpreter.
This allows us to make this change without disturbing too much of
the existing code.
There is still a 1-to-1 relationship between Interpreter and the
global object. This will change in the future.
Ultimately, the goal here is to make Interpreter a transient object
that only needs to exist while you execute some code. Getting there
will take a lot more work though. :^)
Note that in LibWeb, the global JS::VM is called main_thread_vm(),
to distinguish it from future worker VM's.
Now that we have a standalone test-js program, the "-t" test mode of the
js REPL is unused and can simply be removed. Required functionality has
been duplicated in test-js (isStrictMode function, loading of testing
utilities).
Also remove outdated information about tests from the js(1) man page.
More work towards supporting multiple global objects. Native C++ code
now get a GlobalObject& and don't have to ask the Interpreter for it.
I've added macros for declaring and defining native callbacks since
this was pretty tedious and this makes it easier next time we want to
change any of these signatures.
The interpreter now has an "underscore is last value" flag, which makes
Interpreter::get_variable() return the last value if:
- The m_underscore_is_last_value flag is enabled
- The name of the variable lookup is "_"
- The result of that lookup is an empty value
That means "_" can still be used as a regular variable and will stop
doing its magic once anything is assigned to it.
Example REPL session:
> 1
1
> _ + _
2
> _ + _
4
> _ = "foo"
"foo"
> 1
1
> _
"foo"
> delete _
true
> 1
1
> _
1
>
This adds regex parsing/lexing, as well as a relatively empty
RegExpObject. The purpose of this patch is to allow the engine to not
get hung up on parsing regexes. This will aid in finding new syntax
errors (say, from google or twitter) without having to replace all of
their regexes first!
This adds a `global` property to the global object so that
`global == this` when typed into the REPL.
This matches what Node.js does and helps with testing on existing
JS code that uses `global` to detect its environment.
Adds the ability for a scope (either a function or the entire program)
to be in strict mode. Scopes default to non-strict mode.
There are two ways to determine the strict-ness of the JS engine:
1. In the parser, this can be accessed with the parser_state variable
m_is_strict_mode boolean. If true, the Parser is currently parsing in
strict mode. This is done so that the Parser can generate syntax
errors at parse time, which is required in some cases.
2. With Interpreter.is_strict_mode(). This allows strict mode checking
at runtime as opposed to compile time.
Additionally, in order to test this, a global isStrictMode() function
has been added to the JS ReplObject under the test-mode flag.
This patch adds an IndexedProperties object for storing indexed
properties within an Object. This accomplishes two goals: indexed
properties now have an associated descriptor, and objects now gracefully
handle sparse properties.
The IndexedProperties class is a wrapper around two other classes, one
for simple indexed properties storage, and one for general indexed
property storage. Simple indexed property storage is the common-case,
and is simply a vector of properties which all have attributes of
default_attributes (writable, enumerable, and configurable).
General indexed property storage is for a collection of indexed
properties where EITHER one or more properties have attributes other
than default_attributes OR there is a property with a large index (in
particular, large is '200' or higher).
Indexed properties are now treated relatively the same as storage within
the various Object methods. Additionally, there is a custom iterator
class for IndexedProperties which makes iteration easy. The iterator
skips empty values by default, but can be configured otherwise.
Likewise, it evaluates getters by default, but can be set not to.
Previously, the Object class had many different types of functions for
each action. For example: get_by_index, get(PropertyName),
get(FlyString). This is a bit verbose, so these methods have been
shortened to simply use the PropertyName structure. The methods then
internally call _by_index if necessary. Note that the _by_index
have been made private to enforce this change.
Secondly, a clear distinction has been made between "putting" and
"defining" an object property. "Putting" should mean modifying a
(potentially) already existing property. This is akin to doing "a.b =
'foo'".
This implies two things about put operations:
- They will search the prototype chain for setters and call them, if
necessary.
- If no property exists with a particular key, the put operation
should create a new property with the default attributes
(configurable, writable, and enumerable).
In contrast, "defining" a property should completely overwrite any
existing value without calling setters (if that property is
configurable, of course).
Thus, all of the many JS objects have had any "put" calls changed to
"define_property" calls. Additionally, "put_native_function" and
"put_native_property" have had their "put" replaced with "define".
Finally, "put_own_property" has been made private, as all necessary
functionality should be exposed with the put and define_property
methods.
- initializing m_line_column to 1 in the lexer results in incorrect
column values in tokens on the first line of input.
- not incrementing m_line_column when EOF is reached results in
an incorrect column value on the last token.
This fixes a bunch of FIXME's in LibLine.
Also handles the case where read() would read zero bytes in vt_dsr() and
effectively block forever by erroring out.
Fixes#2370
LibLine should ultimately not care about what a "token" means in the
context of its user, so force the user to split the buffer itself.
This also allows the users to pick up contextual clues as well, since
they have to lex the line themselves.
This commit pacthes Shell and the JS repl to better handle completions,
so certain wrong behaviours are now corrected as well:
- JS repl can now complete "Object . getOw<tab>"
- Shell can now complete "echo | ca<tab>" and paths inside strings
This patch is unfortunately rather large and might make some things feel
bloated, but it is necessary to fix a few flaws in LibJS, primarily
blindly coercing values to numbers without exception checks - i.e.
interpreter.argument(0).to_i32(); // can fail!!!
Some examples where the interpreter would actually crash:
var o = { toString: () => { throw Error() } };
+o;
o - 1;
"foo".charAt(o);
"bar".repeat(o);
To fix this, we now have the following...
to_double(Interpreter&)
to_i32()
to_i32(Interpreter&)
to_size_t()
to_size_t(Interpreter&)
...and a whole lot of exception checking.
There's intentionally no to_double(), use as_double() directly instead.
This way we still can use these convenient utility functions but don't
need to check for exceptions if we are sure the value already is a
number.
Fixes#2267.
Passing a Heap& to it only to then call interpreter() on that is weird.
Let's just give it the Interpreter& directly, like some of the other
to_something() functions.
There are now two API's on Value:
- Value::to_string(Interpreter&) -- may throw.
- Value::to_string_without_side_effects() -- will never throw.
These are some pretty big sweeping changes, so it's possible that I did
some part the wrong way. We'll work it out as we go. :^)
Fixes#2123.
Giving the lexer the ability to generate errors adds unnecessary
complexity - also it only calls its syntax_error() function in one place
anyway ("unterminated string literal"). But since the lexer *also* emits
tokens like Eof or UnterminatedStringLiteral, it should be up to the
consumer of these tokens to decide what to do.
Also remove the option to not print errors to stderr as that's not
relevant anymore.
Adds fully functioning template literals. Because template literals
contain expressions, most of the work has to be done in the Lexer rather
than the Parser. And because of the complexity of template literals
(expressions, nesting, escapes, etc), the Lexer needs to have some
template-related state.
When entering a new template literal, a TemplateLiteralStart token is
emitted. When inside a literal, all text will be parsed up until a '${'
or '`' (or EOF, but that's a syntax error) is seen, and then a
TemplateLiteralExprStart token is emitted. At this point, the Lexer
proceeds as normal, however it keeps track of the number of opening
and closing curly braces it has seen in order to determine the close
of the expression. Once it finds a matching curly brace for the '${',
a TemplateLiteralExprEnd token is emitted and the state is updated
accordingly.
When the Lexer is inside of a template literal, but not an expression,
and sees a '`', this must be the closing grave: a TemplateLiteralEnd
token is emitted.
The state required to correctly parse template strings consists of a
vector (for nesting) of two pieces of information: whether or not we
are in a template expression (as opposed to a template string); and
the count of the number of unmatched open curly braces we have seen
(only applicable if the Lexer is currently in a template expression).
TODO: Add support for template literal newlines in the JS REPL (this will
cause a syntax error currently):
> `foo
> bar`
'foo
bar'