Commit Graph

90 Commits

Author SHA1 Message Date
1f52a5097c feat(db/katana): don't crash if the query failed 2024-02-10 21:45:11 +07:00
8e2a629427 fix(tesseract/regex): fix wrong length of string because of unicode
Yeah.
2024-02-10 21:44:44 +07:00
c72d2cf16b fix(tesseract/regex): support unicode characters
Mostly from the first part of regexify_text
2024-02-10 11:17:56 +07:00
5ae36d7f2a fix(tesseract/regex): add workaround for é, á and d 2024-02-01 00:08:32 +07:00
e598e34d45 fix(tesseract): add [oc] to regex generation 2024-01-31 23:37:36 +07:00
1a68d9b177 fix(katana/parser): remove -PRINT] part from name
fk calf
2024-01-31 23:34:53 +07:00
07acc53e07 fix(regex): break immediately after appending the last character 2024-01-31 01:48:19 +07:00
5facb83ce7 Revert "fix(regex): break immediately after appending the last character"
This reverts commit bc4dcad932.

Edited to retain the fix in the commit
2024-01-31 01:44:59 +07:00
588e50af33 feat(db/katana): introduce optimized aggregation query functions
But we still use the old one because it's good enough, and i'm too lazy to rewrite that part :)
2024-01-31 01:27:03 +07:00
bc4dcad932 fix(regex): break immediately after appending the last character 2024-01-31 01:25:58 +07:00
4274e8539a fix(katana): lower short string & append last character 2024-01-25 01:15:44 +07:00
b4888c1d72 fix(katana): add y to [uv] regex 2024-01-24 12:47:26 +07:00
3929b509b2 Merge pull request #2 from teppyboy/dependabot/cargo/h2-0.3.24
chore(deps): bump h2 from 0.3.22 to 0.3.24
2024-01-24 10:12:30 +07:00
dependabot[bot]
2b33fca73f chore(deps): bump h2 from 0.3.22 to 0.3.24
Bumps [h2](https://github.com/hyperium/h2) from 0.3.22 to 0.3.24.
- [Release notes](https://github.com/hyperium/h2/releases)
- [Changelog](https://github.com/hyperium/h2/blob/v0.3.24/CHANGELOG.md)
- [Commits](https://github.com/hyperium/h2/compare/v0.3.22...v0.3.24)

---
updated-dependencies:
- dependency-name: h2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-01-23 17:54:13 +00:00
4b69c8a990 Merge pull request #1 from teppyboy/dependabot/cargo/shlex-1.3.0
chore(deps): bump shlex from 1.2.0 to 1.3.0
2024-01-24 00:53:44 +07:00
a317d4f28b chore: add regexify-text debug command 2024-01-24 00:51:26 +07:00
9609e1b217 fix(katana): proper check for ascii alphanumeric 2024-01-24 00:34:44 +07:00
43a869660f fix(tesseract): catches panic in init thread 2024-01-23 23:40:46 +07:00
dependabot[bot]
547a6457f8 chore(deps): bump shlex from 1.2.0 to 1.3.0
Bumps [shlex](https://github.com/comex/rust-shlex) from 1.2.0 to 1.3.0.
- [Changelog](https://github.com/comex/rust-shlex/blob/master/CHANGELOG.md)
- [Commits](https://github.com/comex/rust-shlex/commits)

---
updated-dependencies:
- dependency-name: shlex
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-01-22 21:54:53 +00:00
093503cd97 fix: fix a bug in previous commit
Fk Rust
2024-01-21 22:41:14 +07:00
8ed238455d fix(katana): a -> [ao] in regex
I plan to move to ScyllaDB once Mongo gets too slow
2024-01-21 22:38:22 +07:00
1933a1fda2 fix(katana): remove last X and x in regex query 2024-01-19 21:16:05 +07:00
d7b9540004 fix(katana): wrong method 💀 2024-01-17 02:26:04 +07:00
744b1a9065 fix(katana): do not add word boundary if we detect a special character 2024-01-17 02:15:23 +07:00
d5ab4169f0 fix(katana): regex 2024-01-14 19:37:30 +07:00
389efeac0d fix(katana): implement regex for short strings
Pls work
2024-01-14 18:27:48 +07:00
a3b247bcb9 fix(katana): handle calf error "?" wishlist 2024-01-14 17:06:45 +07:00
4f8a29230f fix(katana): add [a-z0-9] to regex
Because we defined word boundary
2024-01-13 02:52:10 +07:00
0ab5c23a3b fix(katana): for real 2024-01-12 19:03:13 +07:00
d13e26c076 fix(katana): ah shit regex 2024-01-12 18:48:47 +07:00
0f97ec8810 ci: install rust nightly 2024-01-12 18:30:27 +07:00
b1ebd75c88 chore: install tesseract first 2024-01-12 18:27:17 +07:00
36f3cb76c3 chore: create rust.yml 2024-01-12 18:21:54 +07:00
8c581348c1 chore: add git hash to info command 2024-01-12 18:15:30 +07:00
50c4b69eb5 fix(katana): add more workaround & partial match 2024-01-12 18:14:50 +07:00
6c3c60b141 fix(katana): aaa 2024-01-11 22:19:08 +07:00
cce8dedccf fix(katana): a chain of Ok Err 2024-01-11 22:18:39 +07:00
17dd04645a chore: use autosharded 2024-01-11 22:15:37 +07:00
4341695c74 fix(katana): index out of range
Also handle some error instead of unwrapping it
2024-01-11 22:09:18 +07:00
cde01d45d7 fix(katana): add more workaround to regex
💀
2024-01-10 23:05:32 +07:00
d46e0f8e6f fix(katana): add more workarounds 2024-01-10 22:59:34 +07:00
8c1e39708f fix(katana): only regex if the word length is > 2 2024-01-10 15:32:30 +07:00
f13d50e0f6 chore: remove log level restriction in debug commands 2024-01-10 12:44:58 +07:00
952467d4b1 chore(katana): move regex building to swordfish
Welp, I can add my dirty workaround now
2024-01-10 12:31:29 +07:00
5accadf277 fix(katana): algo 2024-01-10 06:30:37 +07:00
ba0bdfee37 fix(katana): for real
Basically a partial revert of 6f35d05a3e and also using char instead
2024-01-10 06:20:32 +07:00
5d614df9f9 fix(katana): fix broken text algorithm 2024-01-10 06:16:10 +07:00
2b6dc03040 feat(katana): implement character parsing from Calf's analysis
Also fix bug in kc o:w parsing
2024-01-10 02:09:34 +07:00
6f35d05a3e fix(katana): improve the image analyzing process
Zamn, decreasing contrast and also copying Nori code.
2024-01-10 01:24:22 +07:00
1b943f5698 fix(katana): add () to allowed chars 2024-01-08 23:45:46 +07:00