-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
/r/demo/boards/
2 Brainstorming
#1
Comments
[23-09-2024]: Initial Thoughts for SpecsNote TL;DR:
Context:Discussions centered on how to structure boards and posts in a unified model, with posts as the base type for both. Treating boards as a type of post simplifies the architecture by allowing them to share common fields, such as slugs. Boards are differentiated by their parent-child relationships, where a board is simply a post without a parent. Posts as a Universal Structure:In this model, everything is treated as a post, including boards. Boards do not require a separate structure, as they are identified by their lack of a parent. Managing boards, posts, and replies uniformly reduces complexity and avoids the need for multiple types. Comments and sub-posts are children of a parent post, and each post can be rendered differently depending on its label or type. Note This unified structure reduces the overhead of maintaining multiple types and adds flexibility at the UI level by allowing different post behaviors. Hierarchical Slug DesignThe slug system is key to organizing the hierarchy. A slug like Using prefix searches allows efficient iteration through all replies or sub-posts. This method eliminates the need for complex joins or additional structures for parent-child queries, leveraging the slug structure to quickly identify related posts. // Example Slug System:
// "0_blog" > board
// "0_blog:1_username1" > post under blog board
// "0_blog:1_username1r:2_username2" > reply to username1's post Iterating Through Posts Using SlugsExample 1: Iteration Using Plain Slugs (courtesy of @jeronimoalbi)When using plain slugs (no ID prefixes), we rely on string operations like start := "blog"
tree.Iterate(start, "", func(slug string, _ interface{}) bool {
if !strings.HasPrefix(slug, start) {
return true // Stop when slug no longer matches the prefix
}
println(" > " + slug) // Output: "blog", "blog/a", "blog/a/1", etc.
return false
}) Without a clear end point, we manually check if each item's slug still matches the prefix, which can be inefficient. Example 2: Iteration Using ID Prefixes (courtesy of @jeronimoalbi)By adding ID prefixes to slugs (e.g., start := "1_" // ID is used as the start instead of slug
end := "2" // End is expressed as string(ID+1)
tree.Iterate(start, end, func(slug string, _ interface{}) bool {
println(" > " + slug) // Output: "1_blog", "1_blog/a", "1_blog/b", etc.
return false
}) Using ID prefixes allows iteration to be more efficient, as it avoids the need for string matching checks at each step. AVL Tree as the Core Data StructureThe AVL tree is proposed as the primary data structure for storing posts, keyed by the slug. This ensures that all posts are stored in a balanced and efficient manner. Although AVL trees offer O(log n) complexity for insertions and searches, concerns were raised about their efficiency for title-based searches. Searching for a post by its title would require traversing the entire tree, which is inefficient for large datasets. Warning Global searches, particularly for non-slug fields like titles, might introduce inefficiencies especially for large datasets. Handling Global SearchTitle-based searches across the AVL tree are inefficient. Supplementary structures or inverted indexes are considered for fast lookups by title or other non-hierarchical fields. The slug structure remains useful for hierarchical queries, while additional indexing will handle broader searches. Differentiating Boards and Posts in PracticeStructurally, boards and posts are identical as posts. They can be distinguished by labels or type fields. For instance, boards can have the label Iteration & Data AccessEfficient iteration is crucial, especially when accessing data chunks. All posts and sub-posts should be retrievable without multiple calls when iterating through a board. This is achievable by iterating over the slug hierarchy, using the prefix to access relevant posts quickly. AVL tree traversal allows for chunked access, ensuring that only relevant data is loaded, which optimizes performance. Constraints of a Single AVL TreeManaging everything in a single AVL tree can lead to inefficiencies. Handling different aspects of the hierarchy (boards, posts, comments) within a single tree might not scale well, especially for complex queries like global title searches. The use of multiple AVL trees—one for each level of the hierarchy—or separate trees for posts and comments is being considered to improve performance. Tip Multiple AVL trees could be necessary to optimize data handling, improving performance by distributing the load across different trees for specific types of queries. Complexity Analysis:Warning This assumes the implementation in gno.land and needs checking Summary of Time Complexity
Summary of Space Complexity
Context:For basic operations like insertion and search, we use an AVL tree as our core data structure. AVL trees are designed to remain balanced, which means their height is always logarithmic in proportion to the number of nodes. This is why insertion and search take O(log n) time. Every time a new post is added or searched for, the AVL tree needs to rebalance itself to ensure efficient future operations. When iterating through posts using plain slugs (without ID prefixes), the system needs to compare each post's slug with a target prefix (using something like The idea behind ID prefixes is to speed up this process. If every post’s slug has a unique numeric prefix (like Then, global searches, like looking for a post by its title, present more of a challenge with AVL trees. Since the tree is organized by slugs, not titles, you’d have to traverse the entire tree to check each title. This takes O(n) time. This inefficiency is why we mentioned possibly adding supplementary structures for fields like titles in the future. But, without them, a title search in an AVL tree is linear. |
[26-09-2024]: Originally shared internally During the sync today, we spent some time catching up with Kirk and discussing previous topics. Jerónimo has also shared additional sample work and notes:
We discussed whether posts should reference parent posts directly, and the impact on traversal and efficiency. Another key challenge is naming and structuring the package, considering the unified "posts" structure, given its hierarchical nature. Simply calling it "posts" could be misleading and may not reflect its full functionality.
|
[Originally shared internally]: Based on what Ilker shared with the help option that constructs a tx for function calls in the CLI—there could be something implemented in gno.land that resembles Actions and Blinks (in Solana), and used for Blinks could be shareable links or URLs that let users initiate on-chain actions, like voting, posting, or commenting on boards. Someone could click a link in a message, on a board, or even on other social media, and it would open their wallet to approve the action without needing to directly interact with the dApp. Actions could then handle the automation of common tasks like posting or moderating (across one or more boards). Instead of crafting a transaction from scratch, the user could trigger pre-built transactions that execute these tasks, making interactions with boards more intuitive, especially for those less familiar with blockchain. An example of this is the |
[30-09-2024]: Note TL;DR:
Definition Proposal for Generic
|
Operation | Node Reference | Full Subtree Duplication | Copy-On-Write (COW) |
---|---|---|---|
Fork Creation | O(1) (reference only) | O(k) (k = size of subtree) | O(1) (reference only) |
Fork Modification | O(log n) | O(log n) | O(log n) |
Insertion (Post/Thread) | O(log n) | O(log n) | O(log n) |
Search (Slug) | O(log n) | O(log n) | O(log n) |
Iteration (All Posts) | O(n) | O(n) | O(n) |
Iteration by Level (Prefix) | O(k + log n) | O(k + log n) | O(k + log n) |
Global Search (Title) | O(n) | O(n) | O(n) |
Summary of Space Complexity:
Operation | Node Reference | Full Subtree Duplication | Copy-On-Write (COW) |
---|---|---|---|
Fork Creation | O(1) | O(k) | O(1) |
Fork Modification | O(log n) | O(log n) | O(n + k) (forked changes) |
AVL Tree | O(n) | O(n + k) | O(n + m × log n) |
Slug Structure | O(n) | O(n + k) | O(n + m × log n) |
Context:
Three distinct strategies were explored, each presenting different trade-offs in terms of time and space complexity:
-
Node Reference: Fork creation in Node Reference only involves referencing the original post without duplicating any data, resulting in O(1) time complexity. No data is copied until a modification occurs, which triggers O(log n) complexity for updating the AVL tree and copying the necessary nodes. This method is most effective when frequent forks are created that remain largely unchanged.
-
Full Subtree Duplication: In this strategy, the entire subtree is copied during the fork, incurring O(k) time complexity, where
k
represents the size of the subtree. This ensures the forked post is independent from the original, but at a high initial cost. Post-fork modifications, however, follow standard AVL tree operations at O(log n). Full Subtree Duplication is best suited for forks that will undergo extensive independent modifications. -
Copy-On-Write (COW): Initially, COW behaves similarly to Node Reference, with O(1) fork creation as no data is copied at first. When modifications are made, affected nodes are duplicated on demand, leading to O(log n) complexity for updates. While the initial cost is low, the complexity of handling deep changes mirrors that of Full Subtree Duplication. COW works well in scenarios where forks are frequent but modifications are rare or shallow.
For basic operations such as insertion and search using slugs, AVL trees maintain O(log n) complexity. However, iteration over all posts requires visiting each node, resulting in O(n) complexity. When iterating by hierarchical level (prefix-based), the complexity becomes O(k + log n), where k
accounts for the size of the subtree at that level.
The use of multiple AVL trees, where each level (boards, threads, comments) is separated into its own tree, localizes operations. For example, inserting a comment does not affect the tree managing boards, keeping operations efficient within the appropriate scope. This approach reduces unnecessary traversal and allows for more targeted interactions within specific levels of the hierarchy.
Global searches, particularly those based on titles, are more challenging. Since the AVL trees planned are keyed by slugs, title-based searches require traversing the entire tree, leading to O(n) complexity. Without supplementary indexing structures for non-slug fields like titles, global searches become inefficient in large datasets.
Forking strategies that reduce upfront costs, like Node Reference and COW, shift the complexity to the modification phase. Full Subtree Duplication avoids this, ensuring immediate independence for the forked posts, though at a significant initial cost. Iteration complexity remains O(n) across all strategies, with efficiencies gained when operations are scoped to specific trees in the multiple AVL tree model.
Gas Considerations:
Node Reference:
Fork creation with Node Reference incurs minimal gas costs, as no data is copied and only a reference to the original post is stored.
When modifications are made, gas costs increase due to the need to copy affected nodes and update the AVL tree. The costs scale with the depth of the modification, as each change requires writing to storage.
This method works well for frequent forks with few expected modifications, keeping the initial gas costs low.
Full Subtree Duplication:
Fork creation using Full Subtree Duplication incurs higher gas costs because the entire subtree, including all nodes and children, is duplicated and written to storage.
Once the fork is created, further modifications have lower gas costs, as the forked data is independent and does not require further copying.
This approach is suitable when the forked data needs to be modified significantly and independently from the original.
Copy-On-Write (COW):
COW has low gas costs during fork creation, as no data is copied initially—only references to the original nodes are created.
When modifications are made, gas costs increase due to the need to duplicate the affected nodes and write the changes to storage. These costs scale with the depth of the modifications.
COW is useful when forks are frequent but are not expected to undergo extensive modifications.
[30-09-2024]: Action Items:
|
Additional Suggestions and Actions:The following suggestions focus on expanding the architecture and enabling potential functionality for developers building and using a "board" as a realm. Some ideas may be better suited for future versions (e.g., "v3") but are outlined here for consideration:
|
[02-10-2024]: We will use a feature branch on the gno core repo to leverage existing CI. |
How to Name?Common Platform Concepts:
Overview of Platform Content Structure:Here is an overview of common platforms and their respective content structures and interaction models:
|
As I understand it, a transaction is processed as a single thread, and multiple transactions are processed in sequence. (This is necessary for determinism.) I don't think gno.land has an issue with "simultaneous modifications" or need for locks. Is that right? |
I'm not sure about whether it is a good idea to unify boards and posts in a single data structure. Consider what you want to achieve, in terms of end-user experience, and then try to work the data structures towards that. Trying to mash them together into a single data structure may lead to a lot of places where the logic is fundamentally different |
Similarly, I don't see it as strictly necessary to put everything into a single avl.Tree. As I wrote separately on Slack, if forkability is a requirement, you should maybe consider writing a data structure which is a merkle tree + avl tree; if made properly, this can make it very easy to fork a board into its own both in the same realm and from a different realm entirely, as the state is identified by a hash which uniquely "summarizes" the state of the boards. (For reference, this is how git works; and I think we can all agree that git's branching works fairly well ;) ) Keep in mind it's generally good to use avl.Tree's as substitute for SQL indices and sorting mechanisms. A good strategy needs to be figured out; but keep in mind that if two avl.Tree's point to the same |
Personally, I see global searches as being mostly a feature that can be developed with a custom, off-chain indexer. My general advice is to put on-chain all data and methods pertaining to the boards directly; but we should offer off-chain mechanisms which can vastly improve the UX, such as search. But attempting to do full-text-search on-chain is unlikely to scale well; this may change at some point (ie. if FTS is something that is very often implemented by off-chain indexers, it may make sense to figure out how such functionality can be worked into realms directly, eventually). |
Thanks for the feedback! |
The core team has provided a useful issue on outstanding items related to boards: |
Context:
Tracking issue for specification efforts for
boards2
.Related Notes & Links:
/r/demo/boards2/
The text was updated successfully, but these errors were encountered: