Currently, integers are stored in LibSQL as 32-bit signed integers, even
if the provided type is unsigned. This resulted in a series of unchecked
unsigned-to-signed conversions, and prevented storing 64-bit values.
Further, mathematical operations were performed without similar checks,
and without checking for overflow.
This changes SQL::Value to behave like SQLite for INTEGER types. In
SQLite, the INTEGER type does not imply a size or signedness of the
underlying type. Instead, SQLite determines on-the-fly what type is
needed as values are created and updated.
To do so, the SQL::Value variant can now hold an i64 or u64 integer. If
a specific type is requested, invalid conversions are now explictly an
error (e.g. converting a stored -1 to a u64 will fail). When binary
mathematical operations are performed, we now try to coerce the RHS
value to a type that works with the LHS value, failing the operation if
that isn't possible. Any overflow or invalid operation (e.g. bitshifting
a 64-bit value by more than 64 bytes) is an error.
This partially implements SQLite's bind-parameter expression to support
indicating placeholder values in a SQL statement. For example:
INSERT INTO table VALUES (42, ?);
In the above statement, the '?' identifier is a placeholder. This will
allow clients to compile statements a single time while running those
statements any number of times with different placeholder values.
Further, this will help mitigate SQL injection attacks.
This will make it easier to support both string types at the same time
while we convert code, and tracking down remaining uses.
One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
Database::get_table currently either returns a RefPtr to an existing
table, a nullptr if the table doesn't exist, or an Error if some
internal error occured. Change this to return a NonnullRefPtr to an
exisiting table, or a SQL::Result with any error, including if the
table was not found. Callers can then handle that specific error code
if they want.
Returning a NonnullRefPtr will enable some further cleanup. This had
some fallout of needing to change some other methods' return types from
AK::ErrorOr to SQL::Result so that TRY may continue to be used.
Database::get_schema currently either returns a RefPtr to an existing
schema, a nullptr if the schema doesn't exist, or an Error if some
internal error occured. Change this to return a NonnullRefPtr to an
exisiting schema, or a SQL::Result with any error, including if the
schema was not found. Callers can then handle that specific error code
if they want.
Returning a NonnullRefPtr will enable some further cleanup. This had
some fallout of needing to change some other methods' return types from
AK::ErrorOr to SQL::Result so that TRY may continue to be used.
After splitting a node, the new node was written to the same pointer as
the current node - probably a copy / paste error. This new code requires
a `.pointer() -> u32` to exist on the object to be serialized,
preventing this issue from happening again.
Fixes#15844.
Now that expression evaluation can use TRY, we can allow binary operator
methods to fail as well. This also fixes a few instances of converting a
Value to a double when we meant to convert to an integer.
The result of a SQL statement execution is either:
1. An error.
2. The list of rows inserted, deleted, selected, etc.
(2) is currently represented by a combination of the Result class and
the ResultSet list it holds. This worked okay, but issues start to
arise when trying to use Result in non-statement contexts (for example,
when introducing Result to SQL expression execution).
What we really need is for Result to be a thin wrapper that represents
both (1) and (2), and to not have any explicit members like a ResultSet.
So this commit removes ResultSet from Result, and introduces ResultOr,
which is just an alias for AK::ErrorOrr. Statement execution now returns
ResultOr<ResultSet> instead of Result. This further opens the door for
expression execution to return ResultOr<Value> in the future.
Lastly, this moves some other context held by Result over to ResultSet.
This includes the row count (which is really just the size of ResultSet)
and the command for which the result is for.
The implementation of LIKE uses regexes under the hood, and this
implementation of REGEXP takes the same approach. It employs
PosixExtended from LibRegex with case insensitive and Unicode flags
set. The implementation of LIKE is based on SQLlite specs, but SQLlite
does not offer directions for a built-in regex functionality, so this
one uses LibRegex.
Ordering is done by replacing the straight Vector holding the query
result in the SQLResult object with a dedicated Vector subclass that
inserts result rows according to their sort key using a binary search.
This is done in the ResultSet class.
There are limitations:
- "SELECT ... ORDER BY 1" (or 2 or 3 etc) is supposed to sort by the
n-th result column. This doesn't work yet
- "SELECT ... column-expression alias ... ORDER BY alias" is supposed to
sort by the column with the given alias. This doesn't work yet
What does work however is something like
```SELECT foo FROM bar SORT BY quux```
i.e. sorted by a column not in the result set. Once functions are
supported it should be possible to sort by random functions.
Fixes a crash that was caused by a syntax error which is difficult to
catch by the parser: usually identifiers are accepted in column lists,
but they are not in a list of column values to be inserted in an INSERT.
Fixed this by putting in a heuristic check; we probably need a better
way to do this.
Included tests for this case.
Also introduced a new SQL Error code, `NotYetImplemented`, and return
that instead of crashing when encountering unimplemented SQL.
The handling of filesystem level errors was basically non-existing or
consisting of `VERIFY_NOT_REACHED` assertions. Addressed this by
* Adding `open` methods to `Heap` and `Database` which return errors.
* Changing the interface of methods of these classes and clients
downstream to propagate these errors.
The constructors of `Heap` and `Database` don't open the underlying
filesystem file anymore.
The SQL statement handlers return an `SQLErrorCode::InternalError`
error code if an error comes back from the lower levels. Note that some
of these errors are things like duplicate index entry errors that should
be caught before the SQL layer attempts to actually update the database.
Added tests to catch attempts to open weird or non-existent files as
databases.
Finally, in between me writing this patch and submitting the PR the
AK::Result<Foo, Bar> template got deprecated in favour of ErrorOr<Foo>.
This resulted in more busywork.
This patch introduces table joins. It uses a pretty dumb algorithm-
starting with a singleton '__unity__' row consisting of a single boolean
value, a cartesian product of all tables in the 'FROM' clause is built.
This cartesian product is then filtered through the 'WHERE' clause,
again without any smarts just using brute force.
This patch required a bunch of busy work to allow for example the
ColumnNameExpression having to deal with multiple tables potentially
having columns with the same name.
Because SQL is the craptastic language that it is, sometimes expressions
need to know details about the calling statement. For example the tables
in the 'FROM' clause may be needed to determine which columns are
referenced in 'WHERE' expressions. So the current statement is added
to the ExecutionContext and a new 'execute' overload on Statement is
created which takes the Database and the Statement and builds an
ExecutionContaxt from those.
Command used:
grep -Pirn '(out|warn)ln\((?!["\)]|format,|stderr,|stdout,|output, ")' \
AK Kernel/ Tests/ Userland/
(Plus some manual reviewing.)
Let's pick ArgsParser as an example:
outln(file, m_general_help);
This will fail at runtime if the general help happens to contain braces.
Even if this transformation turns out to be unnecessary in a place or
two, this way the code is "more obviously" correct.
This patch provides very basic, bare bones implementations of the
INSERT and SELECT statements. They are *very* limited:
- The only variant of the INSERT statement that currently works is
SELECT INTO schema.table (column1, column2, ....) VALUES
(value11, value21, ...), (value12, value22, ...), ...
where the values are literals.
- The SELECT statement is even more limited, and is only provided to
allow verification of the INSERT statement. The only form implemented
is: SELECT * FROM schema.table
These statements required a bit of change in the Statement::execute
API. Originally execute only received a Database object as parameter.
This is not enough; we now pass an ExecutionContext object which
contains the Database, the current result set, and the last Tuple read
from the database. This object will undoubtedly evolve over time.
This API change dragged SQLServer::SQLStatement into the patch.
Another API addition is Expression::evaluate. This method is,
unsurprisingly, used to evaluate expressions, like the values in the
INSERT statement.
Finally, a new test file is added: TestSqlStatementExecution, which
tests the currently implemented statements. As the number and flavour of
implemented statements grows, this test file will probably have to be
restructured.