Default to re2 parser if available
After benchmarking, the results are out, at least on the current
sample file:

First, re2 is ridiculously faster than the basic parser, even with
tons of caching. re2 does benefit from caching, but it's so fast that
it needs very high hit rates (so a very large cache) for the caching
to have a real impact. It's fast enough that at low hit rates (small
sizes) the cache visibly slows down parsing, which is not the case
for the basic parser.
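
One way to read the point above is as a break-even calculation: if every
lookup pays a fixed cache overhead and only misses pay for a parse, the
cache helps only once the hit rate exceeds overhead / parse cost. The
sketch below uses made-up timings (not the benchmark's numbers) and
hypothetical function names.

```python
def cached_cost(parse_cost: float, cache_overhead: float, hit_rate: float) -> float:
    """Expected cost per lookup: every call pays the cache overhead,
    only misses pay for a real parse."""
    return cache_overhead + (1.0 - hit_rate) * parse_cost


def break_even_hit_rate(parse_cost: float, cache_overhead: float) -> float:
    """Hit rate above which the cached parser beats the bare parser.

    Derived from: overhead + (1 - h) * parse < parse  <=>  h > overhead / parse.
    """
    return cache_overhead / parse_cost


# Hypothetical per-call costs in microseconds, for illustration only.
SLOW_PARSE = 500.0
FAST_PARSE = 5.0
OVERHEAD = 1.0

slow_break_even = break_even_hit_rate(SLOW_PARSE, OVERHEAD)
fast_break_even = break_even_hit_rate(FAST_PARSE, OVERHEAD)
```

Under these assumed costs the fast parser needs a hit rate a hundred
times higher than the slow one before a cache pays for itself.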

Second, LRU is confirmed to be a better cache replacement policy than
clearing (which... duh). The difference is not very noticeable at very
low sizes, but at 100 entries LRU starts really pulling ahead, so it is
definitely the better default at 200 (where, even with the overhead of
the more layered approach, it's ahead of the legacy parser and its
fixed 20-entry clearing cache).
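
For readers unfamiliar with the two policies, here is a minimal sketch
of the difference. ClearingCache and LRUCache are hypothetical
stand-ins written for this illustration, not the classes in
ua_parser.caching, and the workload is synthetic: one hot key
interleaved with a stream of one-off keys.

```python
from collections import OrderedDict
from typing import Callable, Dict, List, Tuple


class ClearingCache:
    """Drops every entry once the cache is full."""

    def __init__(self, maxsize: int) -> None:
        self.maxsize = maxsize
        self.data: Dict[str, str] = {}

    def get(self, key: str, compute: Callable[[str], str]) -> Tuple[str, bool]:
        if key in self.data:
            return self.data[key], True
        if len(self.data) >= self.maxsize:
            self.data.clear()
        self.data[key] = compute(key)
        return self.data[key], False


class LRUCache:
    """Evicts only the least recently used entry when over capacity."""

    def __init__(self, maxsize: int) -> None:
        self.maxsize = maxsize
        self.data: "OrderedDict[str, str]" = OrderedDict()

    def get(self, key: str, compute: Callable[[str], str]) -> Tuple[str, bool]:
        if key in self.data:
            self.data.move_to_end(key)  # mark as most recently used
            return self.data[key], True
        self.data[key] = compute(key)
        if len(self.data) > self.maxsize:
            self.data.popitem(last=False)  # drop the oldest entry
        return self.data[key], False


def count_hits(cache, keys: List[str]) -> int:
    # str.upper stands in for an expensive parse.
    return sum(1 for key in keys if cache.get(key, str.upper)[1])


# One hot key alternating with 50 keys that are each seen once.
workload: List[str] = []
for i in range(50):
    workload += ["hot", f"cold-{i}"]

lru_hits = count_hits(LRUCache(2), workload)
clearing_hits = count_hits(ClearingCache(2), workload)
```

On this workload the LRU keeps the hot key resident while the clearing
cache keeps throwing it away along with everything else.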

The locking doesn't seem to have much impact without contention, and
even under contention the LRU still seems to behave way better than
the clearing cache. So fall back to a locked LRU if re2 is not
available.
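
A locked cache can be sketched as a thin wrapper that holds a single
lock around every access to an otherwise unsynchronized cache. SmallLRU
and Locked below are hypothetical names for this illustration and make
no claim about how ua_parser.caching.Locking is implemented.

```python
import threading
from collections import OrderedDict
from concurrent.futures import ThreadPoolExecutor


class SmallLRU:
    """Unsynchronized LRU map, a stand-in for the real cache."""

    def __init__(self, maxsize):
        self.maxsize = maxsize
        self.entries = OrderedDict()

    def get(self, key):
        value = self.entries.get(key)
        if value is not None:
            self.entries.move_to_end(key)
        return value

    def put(self, key, value):
        self.entries[key] = value
        self.entries.move_to_end(key)
        if len(self.entries) > self.maxsize:
            self.entries.popitem(last=False)


class Locked:
    """Serializes every access to the wrapped cache behind one lock."""

    def __init__(self, cache):
        self.cache = cache
        self.lock = threading.Lock()

    def get(self, key):
        with self.lock:
            return self.cache.get(key)

    def put(self, key, value):
        with self.lock:
            self.cache.put(key, value)


def worker(cache, offset):
    # Each thread cycles through 50 keys, starting at a different offset.
    for i in range(1000):
        key = f"ua-{(i + offset) % 50}"
        if cache.get(key) is None:
            cache.put(key, key.upper())


shared = Locked(SmallLRU(20))
with ThreadPoolExecutor(max_workers=4) as pool:
    # list() waits for completion and re-raises any worker exception.
    list(pool.map(lambda n: worker(shared, n), range(4)))
```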
masklinn committed Feb 11, 2024
1 parent 9960dbd commit fa27574
Showing 1 changed file with 14 additions and 5 deletions.
19 changes: 14 additions & 5 deletions src/ua_parser/__init__.py
@@ -46,7 +46,9 @@

 VERSION = (1, 0, 0)

-from typing import Optional
+import contextlib
+from typing import Callable, Optional, Type
+
 from .core import (
     DefaultedParseResult,
     Device,
@@ -65,17 +67,24 @@
 from .caching import CachingParser, Clearing, LRU, Locking
 from .loaders import load_builtins, load_data, load_yaml

+Re2Parser: Optional[Callable[[Matchers], Parser]] = None
+with contextlib.suppress(ImportError):
+    from .re2 import Parser as Re2Parser
+

 parser: Parser


 def __getattr__(name: str) -> Parser:
     global parser
     if name == "parser":
-        parser = CachingParser(
-            BasicParser(load_builtins()),
-            LRU(200),
-        )
+        if Re2Parser is not None:
+            parser = Re2Parser(load_builtins())
+        else:
+            parser = CachingParser(
+                BasicParser(load_builtins()),
+                Locking(LRU(200)),
+            )
         return parser
     raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
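
The change combines two standard patterns: an optional import guarded
by contextlib.suppress(ImportError), and a module-level __getattr__
that builds the default object on first access and then stores it in
the module namespace. The standalone sketch below reproduces both on a
throwaway module object. The module name
accelerator_that_is_not_installed, the demo module, and the string
return values are all assumptions made for the illustration.

```python
import contextlib
import types
from typing import Callable, List, Optional

# Optional dependency: the name stays None when the import fails.
make_fast: Optional[Callable[[], str]] = None
with contextlib.suppress(ImportError):
    # Hypothetical module name, assumed not to be installed.
    from accelerator_that_is_not_installed import make_fast  # type: ignore


def build_default() -> str:
    """Prefer the fast backend, otherwise fall back."""
    if make_fast is not None:
        return make_fast()
    return "fallback"


demo = types.ModuleType("demo")
builds: List[str] = []  # records every call to the lazy hook


def lazy_getattr(name: str) -> str:
    if name == "parser":
        builds.append(name)
        # Storing the result on the module means later lookups find it
        # directly and never reach this hook again.
        demo.parser = build_default()
        return demo.parser
    raise AttributeError(f"module 'demo' has no attribute {name!r}")


# Module attribute lookup falls back to a __getattr__ found in the
# module's namespace when normal lookup fails.
demo.__getattr__ = lazy_getattr

first = demo.parser   # triggers lazy_getattr and builds the default
second = demo.parser  # served from the module namespace
```

In the real module the `global parser` assignment plays the role of
`demo.parser = ...` here: once the name exists in the module globals,
the hook is no longer consulted for it.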
