Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Events adding #7

Open
Youka opened this issue Oct 10, 2014 · 8 comments
Open

Events adding #7

Youka opened this issue Oct 10, 2014 · 8 comments

Comments

@Youka
Copy link
Contributor

Youka commented Oct 10, 2014

On the way making everything safer, calling with invalid event indices shouldn't cause crashes.
Checking the index and returning an error code on failure would be a fix, but i would prefer a stack for events storing, frees users from initialization, a final number of events and passing indices.

@torque
Copy link
Contributor

torque commented Oct 11, 2014

Data structures are kind of a weak point for me. How would you implement a stack? I was thinking maybe a linked list, but this is simplistic and probably suboptimal.

I definitely agree that the init method should be done away with. I would like to hear your thoughts on the backing structure.

@Youka
Copy link
Contributor Author

Youka commented Oct 11, 2014

Stack implementation

@torque
Copy link
Contributor

torque commented Oct 11, 2014

So that's a yes to the linked list, then. I was initially skeptical because seeking through a linked list isn't exactly fast, and there are scripts out there with thousands of events. That said, there's theoretically no good reason to do random seeking through events, so I'm fine with it.

I think a FIFO queue should probably be used because it's likely users will want to access events in the order they're used.

@Youka
Copy link
Contributor Author

Youka commented Oct 11, 2014

FIFO wouldn't be better, because data aren't stored together like in arrays, so there's no advantage in caching or read pointer movement. Additionally FIFO would require a pointer to the end for fast element appending - all in all, a stack is the choice.

An alternative would be a vector: a wrapped array which double his memory size if a new element doesn't fit. This could lead to slower building and some memory wasting, but has the fastest memory access.

@torque
Copy link
Contributor

torque commented Oct 11, 2014

FIFO wouldn't be better from an implementation standpoint, but it'd almost certainly be better from a usage standpoint. As I said, I expect users would probably want to access events in the order that they're added, for which a FILO queue is not useful.

Using a vector is ok, but we'd need to decide on a reasonable default size.

@Youka
Copy link
Contributor Author

Youka commented Oct 11, 2014

I don't see the problem with my stack implementation, there's a getter function for overall access. A FIFO would just search in another direction for the wished element, nothing else. You think of destroying the list after building it, popping one element out after another, but that's not what we want.

A vector starts with size 0, allocates for the first element 1, then doubles and doubles and so on. Allocating large memory chunks / finding big gaps in process heap is slow and with a lot of data, a lot of memory could be wasted (f.e. with 1025 elements used, 1023 would be garbage).

@torque
Copy link
Contributor

torque commented Oct 12, 2014

I've given you commit access, as I don't have the time or energy to spend on this right now.

A few comments: your stack implementation is fine. But if you have a script with 100k events (not impossible with karaoke and lots of frame-by-frame typesetting), in order to access the first event, you have to loop through 99.999k stack nodes just to get to the first one. Since, as I am now saying for the 3rd time, I expect users will want to access their data in the order they add it, this seems very suboptimal.

By simply keeping separate read and write pointers (read pointing to the beginning of the list, write pointing to the end of the list) and changing each node to link to the next rather than the previous, the user would be able to iterate through the list in add order without incurring the overhead of having to seek all the way back through it for early reads.

But maybe I'm just overthinking it and the read order doesn't matter that much.

Also, re: vectors: there's no reason it has to start with a size of zero. There are guaranteed to be some events, otherwise there's no point in even using the library. Wasting time reallocating very small amounts of memory is silly. Off the top of my head I'd suggest maybe 500 or so (which I would expect to fit the majority of scripts with dialogue and moderate typesetting).

Additionally, the growth rate does not have to be by factors of two. It's been suggested a growth factor of 1.5 can be more efficient because it allows reuse of previously freed memory (stack overflow).

As a final note, I find it curious you use pastebin.com when github has a (far superior, in my opinion) pastebin of its own.

Hopefully my feedback has been helpful.

@Youka
Copy link
Contributor Author

Youka commented Oct 12, 2014

The growth factor of 2 was just an example, but the article is right that with 1.5 the memory hole in front the used memory can get reused (which can lead to less memory reservation for a process)... however this includes there's just this one memory chunk and no other allocations which could use the hole too and 1.5 needs more reallocations for a lot of elements than 2 - advantages and disadvantages.

I didn't consider huge scripts and you're right that linked lists are too slow in access, so i'll go with a vector implementation. But vectors should be flexible and not waste a lot of memory when there're tiny ones.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants