Unicode APIs should be used in functions.rb #19

Open
Iristyle opened this issue Dec 14, 2015 · 8 comments
Comments

@Iristyle

ANSI APIs at https://github.com/djberg96/win32-eventlog/blob/ffi2/lib/win32/windows/functions.rb should be replaced with their wide character equivalents to properly support international versions of Windows.

@djberg96
Collaborator

I started a wide_functions branch that did a big chunk of the work, but I think the descriptions are messed up right now. Need to investigate.

@jordansissel

jordansissel commented Aug 15, 2016

Came here after poking around on win32-eventlog trying to get it to emit utf-8. Caveat: This is my first try at doing anything with UTF-16 on Windows.

The main change was this:

windows/functions.rb: Change all ANSI functions to Unicode (FormatMessageA -> FormatMessageW, etc)
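One consequence of switching to the W variants is that any string passed in has to be NUL-terminated UTF-16LE rather than an ANSI byte string. A minimal sketch of that conversion, assuming a hypothetical helper name `to_wide` that is not part of win32-eventlog:

```ruby
# Hypothetical helper (not part of win32-eventlog): convert a Ruby string to
# the NUL-terminated UTF-16LE form that the wide (W) Windows APIs expect.
def to_wide(str)
  # Append the terminator first so it is encoded as a full 16-bit NUL.
  (str + "\0").encode('UTF-16LE')
end

wide = to_wide('Hi')
p wide.bytes # two bytes per character, plus a two-byte terminator
```

The result could then be handed to a wide function in place of the original string; how the buffer is passed depends on how the function is attached.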

The remaining problems I encounter are:

  • buf.read_string stops reading on a NUL (I think?). I don't have a UTF-16-aware solution here because FFI::Pointer doesn't have an obvious "read until ..." construct where we'd need to read until two sequential NULs appeared.
  • eventlog.rb:839#get_description: the str.unpack('Z*', num) won't work for UTF-16 because, for example, the letter 'H' is encoded as the bytes 0x48 0x00, and the Z unpack treats that second NUL byte as a terminating condition. I'm using this instead.
# Split on two consecutive NUL bytes. Force the resulting strings to have encoding label set to UTF-16LE.
str.split("\0\0", num+1)[0..-2].collect { |v| v.force_encoding("UTF-16LE") }
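One caveat with splitting on "\0\0" at the byte level is that the two NUL bytes can straddle a character boundary (the high byte of one character followed by the low byte of the next), so the split may land in the wrong place. A minimal sketch of an alternative that walks the buffer in 16-bit units, assuming the raw bytes are already in a Ruby string (with FFI they could presumably be obtained via `read_bytes` with a known length); `split_wide_strings` is a hypothetical name, not something in win32-eventlog:

```ruby
# Hypothetical helper (not part of win32-eventlog): split a buffer holding
# consecutive NUL-terminated UTF-16LE strings, looking only at whole
# 16-bit code units so a terminator is never matched across a boundary.
def split_wide_strings(bytes, count)
  strings = []
  current = []
  bytes.unpack('v*').each do |unit|   # 'v' = 16-bit little-endian unsigned
    break if strings.size >= count
    if unit.zero?
      # A zero code unit terminates the current string.
      strings << current.pack('v*').force_encoding('UTF-16LE')
      current = []
    else
      current << unit
    end
  end
  strings
end

# "Hi" followed by U+0100; a byte-level split on "\0\0" would cut this
# buffer at a misaligned offset.
raw = "Hi\0\u0100\0".encode('UTF-16LE').b
parts = split_wide_strings(raw, 2)
p parts.map { |s| s.encode('UTF-8') }
```

This is only a sketch: it stops after `count` strings and ignores any trailing odd byte, which may or may not match what the event record layout needs.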

I am not completely successful on this yet, so I haven't any PR to propose. Just documenting my research on a path to Unicode event strings ;P

@evg345

evg345 commented Feb 13, 2017

@jordansissel maybe you can use MultiByteToWideChar(...)? It's a native WinAPI function and it should handle \0\0 characters correctly.

// logstash-input-eventlog-0.6.7 does not work with Cyrillic Windows :( I get messages like "\xEF\xF0\xE8\xE2\xE5\xF2" instead of "привет"
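The escaped bytes quoted above are consistent with "привет" encoded in Windows-1251, the usual ANSI code page on Cyrillic Windows. A minimal sketch of confirming that from Ruby, assuming the bytes arrive as a binary string (this only diagnoses the symptom, it is not a fix for the library):

```ruby
# The bytes reported in the comment above, held as a binary string.
raw = "\xEF\xF0\xE8\xE2\xE5\xF2".b

# Relabel the bytes as Windows-1251 (an assumption about the source code
# page), then transcode to UTF-8.
text = raw.force_encoding('Windows-1251').encode('UTF-8')
puts text
```

If the transcoded text is readable, the strings are coming back from the ANSI APIs in the system code page rather than as Unicode.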

@iantalarico
Contributor

Is this being actively worked on? The library is great but is causing some text to be improperly encoded.

@djberg96
Collaborator

@iantalarico Not at the moment. I started working on it, but ran into some issues. I've been lazy, mostly because I don't really use this library any more.

I'll try to get back to it, but PRs are welcome.

@iantalarico
Contributor

@djberg96 Thanks for the fast response. If I find some time I may send you a PR.

@iantalarico
Contributor

@djberg96 What you had seemed to be almost completely done minus the va_list parsing :) . I just sent PR #23.

@juju4

juju4 commented Sep 1, 2018

I believe I have this issue with td-agent 3.1.1/fluentd 1.0.2 and fluent-plugin-windows-eventlog 0.2.2.

Is there any way to verify whether the source processing or the backend is the problem?

Example: the issue shows up in process_information.process_command_line of event ID 4688.
