Commit graph

59 commits

Author SHA1 Message Date
Timothy Flynn
60782cb219 AK: Do not coerce i64 and u64 values to i32 and u32
First, this isn't actually helpful, as we no longer store 32-bit values
in JsonValue. They are stored as 64-bit values anyways.

But more imporatantly, there was a bug here when trying to coerce an i64
to an i32. All negative values were cast to an i32, without checking if
the value is below NumericLimits<i32>::min.

(cherry picked from commit 7b3b608cafbed8049ac7a34104c66622c1445ffc)
2024-11-14 19:52:23 -05:00
Timothy Flynn
8adcbd7136 AK: Decode paired UTF-16 surrogates in a JSON string
For example, such use is seen on Twitter.

(cherry picked from commit 698a95d2dee0ba7e9a3f1c39af5459ba506c445f)
2024-07-07 18:47:09 +02:00
kleines Filmröllchen
eada4f2ee8 AK: Remove ByteString from GenericLexer
A bunch of users used consume_specific with a constant ByteString
literal, which can be replaced by an allocation-free StringView literal.

The generic consume_while overload gains a requires clause so that
consume_specific("abc") causes a more understandable and actionable
error.
2024-01-12 17:03:53 -07:00
Shannon Booth
e2e7c4d574 Everywhere: Use to_number<T> instead of to_{int,uint,float,double}
In a bunch of cases, this actually ends up simplifying the code as
to_number will handle something such as:

```
Optional<I> opt;
if constexpr (IsSigned<I>)
    opt = view.to_int<I>();
else
    opt = view.to_uint<I>();
```

For us.

The main goal here however is to have a single generic number conversion
API between all of the String classes.
2023-12-23 20:41:07 +01:00
Ali Mohammad Pur
5e1499d104 Everywhere: Rename {Deprecated => Byte}String
This commit un-deprecates DeprecatedString, and repurposes it as a byte
string.
As the null state has already been removed, there are no other
particularly hairy blockers in repurposing this type as a byte string
(what it _really_ is).

This commit is auto-generated:
  $ xs=$(ack -l \bDeprecatedString\b\|deprecated_string AK Userland \
    Meta Ports Ladybird Tests Kernel)
  $ perl -pie 's/\bDeprecatedString\b/ByteString/g;
    s/deprecated_string/byte_string/g' $xs
  $ clang-format --style=file -i \
    $(git diff --name-only | grep \.cpp\|\.h)
  $ gn format $(git ls-files '*.gn' '*.gni')
2023-12-17 18:25:10 +03:30
Hendiadyoin1
e02a4f5181 AK: Bring JsonParser's string consumption closer to the ECMA 404 spec
I added some spec comments, and implementation notices, this should not
change behavior in a significant way.

The previous code was quite unwieldy and repetitive.
The long `if(next_is('X'))` chain is now a smaller `switch`.
I also reinstated the fast path for long sequences of literal
characters, which was broken in 0aad21fff2
2023-11-01 17:28:19 -06:00
Timothy Flynn
f630a5ca71 AK+LibJS: Remove error-prone JsonValue constructor
Consider the following:

    JsonValue value { JsonValue::Type::Object };
    value.as_object().set("foo"sv, "bar"sv);

The JsonValue(Type) constructor does not initialize the underlying union
that stores its value. Thus JsonValue::as_object() will A) refer to an
uninitialized union member, B) deference that member.

This constructor only has 2 users, both of which initialize the type to
Type::Null. Rather than implementing unused functionality here, replace
those uses with the default JsonValue constructor, and remove the faulty
constructor.
2023-11-01 11:15:26 -04:00
Ali Mohammad Pur
aeee98b3a1 AK+Everywhere: Remove the null state of DeprecatedString
This commit removes DeprecatedString's "null" state, and replaces all
its users with one of the following:
- A normal, empty DeprecatedString
- Optional<DeprecatedString>

Note that null states of DeprecatedFlyString/StringView/etc are *not*
affected by this commit. However, DeprecatedString::empty() is now
considered equal to a null StringView.
2023-10-13 18:33:21 +03:30
Cameron Youell
ada6dbf636 AK: Use JsonArray::append when parsing array 2023-04-24 09:21:51 +02:00
Cameron Youell
8134dccdc7 AK: Add new failable JsonArray::{append/set} functions
Move all old usages to the more explicit `JsonArray:must_{append/set}`
2023-04-24 09:21:51 +02:00
Linus Groh
57dc179b1f Everywhere: Rename to_{string => deprecated_string}() where applicable
This will make it easier to support both string types at the same time
while we convert code, and tracking down remaining uses.

One big exception is Value::to_string() in LibJS, where the name is
dictated by the ToString AO.
2022-12-06 08:54:33 +01:00
Linus Groh
6e19ab2bbc AK+Everywhere: Rename String to DeprecatedString
We have a new, improved string type coming up in AK (OOM aware, no null
state), and while it's going to use UTF-8, the name UTF8String is a
mouthful - so let's free up the String name by renaming the existing
class.
Making the old one have an annoying name will hopefully also help with
quick adoption :^)
2022-12-06 08:54:33 +01:00
davidot
c9aa664eb0 AK: Make the JsonParser use the new double parser for numbers
Because we still support u64 and i64 (on top of i32 and u32) we do still
have to parse the number ourself first. Then if we determine that the
number is a floating point or is outside of the range of i64 and u64 we
fallback and parse it as a double.

Before JsonParser had ifdefs guarding the double computation, but it
just build when we error on ifdef KERNEL so JsonParser is no longer
usable in the Kernel. This can be remedied fairly easily but since
it is not needed we #error on that for now.
2022-10-23 15:48:45 +02:00
davidot
75ebcf6b4a AK: Allow exponents in JSON double values
This is required for ECMA-404 compliance, but probably not for serenity
itself.
2022-09-02 02:07:37 +01:00
sin-ack
e5f09ea170 Everywhere: Split Error::from_string_literal and Error::from_string_view
Error::from_string_literal now takes direct char const*s, while
Error::from_string_view does what Error::from_string_literal used to do:
taking StringViews. This change will remove the need to insert `sv`
after error strings when returning string literal errors once
StringView(char const*) is removed.

No functional changes.
2022-07-12 23:11:35 +02:00
serenitydev
23c72c6728 AK: Fix userland parsing of rounded floating point numbers
Parse JSON floating point literals properly,
No longer throwing a SyntaxError when the decimal portion
of the number exceeds the capacity of u32.

Added tests to AK/TestJSON and LibJS/builtins/JSON/JSON.parse
2022-02-16 07:22:51 -05:00
ForLoveOfCats
1d95fd5443 AK: Identify negative zero when parsing Json and represent with a double
Regardless of the backing type that the number would otherwise parse to,
if it is zero and the sign was supposed to be negative then it needs to
be a floating point number to represent the correct value.
2022-01-19 21:51:09 +00:00
Andreas Kling
587f9af960 AK: Make JSON parser return ErrorOr<JsonValue> (instead of Optional)
Also add slightly richer parse errors now that we can include a string
literal with returned errors.

This will allow us to use TRY() when working with JSON data.
2021-11-17 00:21:10 +01:00
Brian Gianforcaro
14eb736e22 AK: Reduce the scope of fraction_string to where it's needed
pvs-studio flagged this a potential optimization, as we only
need to really construct the fraction_string if is_double is
true.
2021-09-16 17:17:13 +02:00
Ali Mohammad Pur
5d170810db AK: Make JsonParser correctly parse unsigned values larger than u32
Prior to this, it'd try to stuff them into an i64, which could fail and
give us nothing.
Even though this is an extension we've made to JSON, the parser should
be able to correctly round-trip from whatever our serialiser has
generated.
2021-07-15 01:47:35 +02:00
stelar7
ce314c54bd JsonParser: Bring parser more to spec 2021-07-05 12:36:19 +02:00
Andreas Kling
fccfc33dfb AK: Use move semantics to avoid copying in JSON parser
The JSON parser was deep-copying JsonValues left and right, and it was
all totally avoidable. :^)
2021-05-14 11:54:43 +02:00
Brian Gianforcaro
1682f0b760 Everything: Move to SPDX license identifiers in all files.
SPDX License Identifiers are a more compact / standardized
way of representing file license information.

See: https://spdx.dev/resources/use/#identifiers

This was done with the `ambr` search and replace tool.

 ambr --no-parent-ignore --key-from-file --rep-from-file key.txt rep.txt *
2021-04-22 11:22:27 +02:00
Linus Groh
e265054c12 Everywhere: Remove a bunch of redundant 'AK::' namespace prefixes
This is basically just for consistency, it's quite strange to see
multiple AK container types next to each other, some with and some
without the namespace prefix - we're 'using AK::Foo;' a lot and should
leverage that. :^)
2021-02-26 16:59:56 +01:00
Ben Wiederhake
c6a42ab5c3 Everywhere: Remove unnecessary headers 4/4
Arbitrarily split up to make git bisect easier.

These unnecessary #include's were found by combining an automated tool (which
determined likely candidates) and some brain power (which decided whether
the #include is also semantically superfluous).
2021-02-08 18:03:57 +01:00
Sahan Fernando
b37139e111 AK: Make JsonParser::parse_number properly parse >32bit ints 2020-12-21 00:15:44 +01:00
Tucker Polomik
ad8284bac6 AK: check fractional string has_value() in JsonParser
Resolves #3670
2020-10-06 14:27:51 +02:00
Benoît Lormeau
f0f6b09acb AK: Remove the ctype adapters and use the actual ctype functions instead
This finally takes care of the kind-of excessive boilerplate code that were the
ctype adapters. On the other hand, I had to link `LibC/ctype.cpp` to the Kernel
(for `AK/JsonParser.cpp` and `AK/Format.cpp`). The previous commit actually makes
sense now: the `string.h` includes in `ctype.{h,cpp}` would require to link more LibC
stuff to the Kernel when it only needs the `_ctype_` array of `ctype.cpp`, and there
wasn't any string stuff used in ctype.
Instead of all this I could have put static derivatives of `is_any_of()` in the
concerned AK files, however that would have meant more boilerplate and workarounds;
so I went for the Kernel approach.
2020-09-27 21:15:25 +02:00
Benoît Lormeau
7b356c33cb
AK: Add a GenericLexer and extend the JsonParser with it (#2696) 2020-08-09 11:34:26 +02:00
Nico Weber
ce95628b7f Unicode: Try s/codepoint/code_point/g again
This time, without trailing 's'. Ran:

    git grep -l 'codepoint' | xargs sed -ie 's/codepoint/code_point/g
2020-08-05 22:33:42 +02:00
Nico Weber
19ac1f6368 Revert "Unicode: s/codepoint/code_point/g"
This reverts commit ea9ac3155d.
It replaced "codepoint" with "code_points", not "code_point".
2020-08-05 22:33:42 +02:00
Andreas Kling
ea9ac3155d Unicode: s/codepoint/code_point/g
Unicode calls them "code points" so let's follow their style.
2020-08-03 19:06:41 +02:00
LepkoQQ
b1c99c1891 AK: Fix JsonParser double encoding multibyte utf-8 chararcters 2020-06-20 17:04:03 +02:00
Hüseyin ASLITÜRK
0aad21fff2 AK: JsonParser, replace char type to u32 for code point 2020-06-16 13:15:17 +02:00
Matthew Olsson
e8e728454c AK: JsonParser improvements
- Parsing invalid JSON no longer asserts
    Instead of asserting when coming across malformed JSON,
    JsonParser::parse now returns an Optional<JsonValue>.
- Disallow trailing commas in JSON objects and arrays
- No longer parse 'undefined', as that is a purely JS thing
- No longer allow non-whitespace after anything consumed by the initial
  parse() call. Examples of things that were valid and no longer are:
    - undefineddfz
    - {"foo": 1}abcd
    - [1,2,3]4
- JsonObject.for_each_member now iterates in original insertion order
2020-06-13 12:43:22 +02:00
Andreas Kling
fdfda6dec2 AK: Make string-to-number conversion helpers return Optional
Get rid of the weird old signature:

- int StringType::to_int(bool& ok) const

And replace it with sensible new signature:

- Optional<int> StringType::to_int() const
2020-06-12 21:28:55 +02:00
Hüseyin ASLITÜRK
8e9776165f AK: JSON, Escape spacial character in string serialization 2020-06-03 21:52:40 +02:00
Tibor Nagy
1bf93d504f AK: Break on end of input in JsonParser::consume_quoted_string
Fixes #1599
2020-04-04 10:31:01 +02:00
Emanuel Sprung
c54855682c AK: A few JSON improvements
* Add double number to object serializer

* Handle negative double numbers correctly

* Handle \r and \n in quoted strings independently
  This improves the situation when keys contain \r or \n that currently
  has the effect that "a\rkey" and "a\nkey" in an JSON object are the
  same key value.
2020-03-31 13:42:39 +02:00
Andreas Kling
e34fe556ca AK: Fix JsonParser kernel build (no floats/doubles in kernel code) 2020-03-24 22:37:24 +01:00
Emanuel Sprung
bca5762542 AK: Add parsing of JSON double values
This patch adds the parsing of double values to the JSON parser.
There is another char buffer that get's filled when a "." is present
in the number parsing. When number finished, a divider is calculated
to transform the number behind the "." to the actual fraction value.
2020-03-24 22:20:07 +01:00
Andreas Kling
900f51ccd0 AK: Move memory stuff (fast memcpy, etc) to a separate header
Move the "fast memcpy" stuff out of StdLibExtras.h and into Memory.h.
This will break a ton of things that were relying on StdLibExtras.h
to include a bunch of other headers. Fix will follow immediately after.

This makes it possible to include StdLibExtras.h from Types.h, which is
the main point of this exercise.
2020-03-08 13:06:51 +01:00
Andreas Kling
22d0a6d92f AK: Remove unnecessary casts to size_t, after Vector changes
Now that Vector uses size_t, we can remove a whole bunch of redundant
casts to size_t.
2020-03-01 12:58:22 +01:00
Andreas Kling
94ca55cefd Meta: Add license header to source files
As suggested by Joshua, this commit adds the 2-clause BSD license as a
comment block to the top of every source file.

For the first pass, I've just added myself for simplicity. I encourage
everyone to add themselves as copyright holders of any file they've
added or modified in some significant way. If I've added myself in
error somewhere, feel free to replace it with the appropriate copyright
holder instead.

Going forward, all new source files should include a license header.
2020-01-18 09:45:54 +01:00
Andreas Kling
821484f170 AK: Fix JSON parser crashing when encountering UTF-8
The mechanism that caches the most recently seen string for each first
character was indexing into the cache using a 'char' subscript. Oops!
2019-12-29 22:20:21 +01:00
Andreas Kling
6f4c380d95 AK: Use size_t for the length of strings
Using int was a mistake. This patch changes String, StringImpl,
StringView and StringBuilder to use size_t instead of int for lengths.
Obviously a lot of code needs to change as a result of this.
2019-12-09 17:51:21 +01:00
Andreas Kling
1ac963b5c8 JsonParser: "" is an empty string, not a null value 2019-08-14 15:01:42 +02:00
Andreas Kling
6d97caf124 JsonParser: Scan ahead to find the first special char in quoted strings
This allows us to take advantage of the now-optimized (to do memmove())
Vector::append(const T*, int count) for collecting these strings.

This is a ~15% speedup on the load_4chan_catalog benchmark.
2019-08-07 11:57:51 +02:00
Andreas Kling
0d0230ab10 JsonParser: Fold extract_while() into parse_number()
It wasn't unsed anywhere else anyway, and this is actually ~1% faster
on the load_4chan_catalog benchmark.
2019-08-04 20:23:46 +02:00
Andreas Kling
72b69b82bb JsonParser: Oops, fix build. 2019-08-04 18:59:06 +02:00