Parse search strings as byteseek regexes, or just convert to byte arrays? #19
Comments
I guess those constructors just throw CompileExceptions. It only affects you if you use those constructors - as it should.
This is essentially a convenience constructor. Either a string is a byte array, or it's a regex. In either case we already have constructors for the outputs (SequenceMatcher or byte array). And the byte array can also be modelled by a SequenceMatcher. What gives the best convenience?
So the SequenceMatcher constructor is the only general constructor for SequenceSearch algorithms.
The downside of making the search algorithms' String constructors process regexes is that it creates a hard dependency from every search algorithm on the byteseek sequence matcher compiler and regex parser.
Currently, matchers and searchers don't depend on the parser and compiler in any way.
The only excuse for such a higher level dependency is convenience - which is what this is. Is the convenience of instantiating hex string (or more complex syntax) searchers directly worth the dependency it creates?
I don't think a hard dependency between the Searcher and Compiler package will really hurt anything. It's a general design principle to try to keep them as cleanly separated as possible, but this is a case where we already had a support question raised by a user. They expected (or wanted) to be able to do this.
I'm going to explore using SequenceMatcher compilers directly in the SequenceSearcher String constructors.
Search algorithms usually provide a String constructor.
Currently, that just lets us match the sequence of bytes encoded by that string (either in the default Charset, or one that's provided).
But since byteseek is byte oriented, a hex string (or full byteseek regex syntax) might be more useful. One person has already asked whether they can pass hex string bytes directly to the search algorithms. I had to say they couldn't, and that they needed to use the SequenceMatcherCompiler to create a SequenceMatcher for the search algorithm to look for.
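To make the hex string idea concrete, here is a rough sketch of what "hex string in, bytes out" means. It is plain JDK code and not byteseek's parser: the class `HexStringSketch` and the method `parseHex` are made-up names for illustration, and the real byteseek syntax supports far more than bare hex pairs.

```java
final class HexStringSketch {

    // Hypothetical helper, not part of byteseek: converts text such as
    // "DE AD be ef" or "deadbeef" into the bytes it spells out.
    static byte[] parseHex(String hex) {
        // Drop whitespace so pairs may be separated for readability.
        String digits = hex.replaceAll("\\s+", "");
        if (digits.length() % 2 != 0) {
            throw new IllegalArgumentException("Odd number of hex digits: " + hex);
        }
        byte[] out = new byte[digits.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(digits.charAt(2 * i), 16);
            int lo = Character.digit(digits.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("Not a hex string: " + hex);
            }
            // Combine the two nibbles into one byte.
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] pattern = parseHex("DE AD be ef");
        System.out.println(pattern.length + " bytes to search for");
    }
}
```

The resulting array could then be handed to a byte[] constructor, which is roughly what the user who asked was hoping the String constructor would do for them.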
The same search algorithms also have a byte[] array constructor, if you want to explicitly search for a byte pattern. It's easy to convert a string to a byte array if that's what needs to be searched for.
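For reference, the conversion mentioned above is a one-liner with the JDK's `String.getBytes(Charset)`. The sketch below only illustrates why naming the charset matters; `StringToBytesSketch` and `toSearchBytes` are hypothetical names, not byteseek classes.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class StringToBytesSketch {

    // Encode the search text with an explicit charset rather than
    // relying on whatever the platform default happens to be.
    static byte[] toSearchBytes(String text, Charset charset) {
        return text.getBytes(charset);
    }

    public static void main(String[] args) {
        String text = "\u00e9"; // e with acute accent
        byte[] utf8 = toSearchBytes(text, StandardCharsets.UTF_8);
        byte[] latin1 = toSearchBytes(text, StandardCharsets.ISO_8859_1);
        // The same string encodes to two bytes in UTF-8 and one in ISO-8859-1,
        // so the byte pattern being searched for depends on the charset.
        System.out.println(Arrays.toString(utf8));
        System.out.println(Arrays.toString(latin1));
    }
}
```

Either array can be passed straight to a byte[] constructor, so nothing is lost for plain text searches if the String constructors go away or change meaning.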
So I guess the String constructor is essentially redundant. Unless we support byteseek regex construction directly, we should just get rid of those constructors.
If we did support byteseek regex syntax in the search algorithm String constructors, what do we do with Compiler / Parse Exceptions?
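One way to look at the exception question is to compare the two obvious options side by side. The sketch below uses stand-in types only: `SketchCompileException`, `SketchCompiler`, `CheckedSearcher` and `UncheckedSearcher` are all hypothetical and are not byteseek classes, and the "compiler" here accepts nothing but pairs of hex digits.

```java
// Hypothetical checked exception standing in for a compiler/parse failure.
final class SketchCompileException extends Exception {
    SketchCompileException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for a pattern compiler: accepts only hex digit pairs.
final class SketchCompiler {
    static byte[] compile(String expression) throws SketchCompileException {
        if (expression.length() % 2 != 0 || !expression.matches("[0-9a-fA-F]*")) {
            throw new SketchCompileException("Not a valid hex pattern: " + expression);
        }
        byte[] out = new byte[expression.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(expression.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }
}

// Option A: the String constructor declares the checked exception,
// so every caller of this constructor must handle or rethrow it.
final class CheckedSearcher {
    final byte[] pattern;

    CheckedSearcher(String expression) throws SketchCompileException {
        this.pattern = SketchCompiler.compile(expression);
    }
}

// Option B: the String constructor wraps the failure in an unchecked
// IllegalArgumentException, keeping the original exception as the cause.
final class UncheckedSearcher {
    final byte[] pattern;

    UncheckedSearcher(String expression) {
        try {
            this.pattern = SketchCompiler.compile(expression);
        } catch (SketchCompileException e) {
            throw new IllegalArgumentException(e.getMessage(), e);
        }
    }
}
```

With option A only code that uses the String constructor pays for the exception; with option B call sites stay clean but a bad pattern surfaces at runtime. Which trade-off fits byteseek better is exactly the open question here.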