Skip to content

Simple Relational Database Management System Implementation in Golang

License

Notifications You must be signed in to change notification settings

ryogrid/SamehadaDB

Repository files navigation

test

Overview

  • Compact Relational Database Management System Implementation Wrote in Golang
  • Origin of SamehadaDB is Golang ported BusTub RDBMS: go-bustub

What is Samehada?

  • Samehada, which literally means shark skin, is a tool used to grate wasabi, usually for sushi, but also for other Japanese cuisines
  • Samehada features its grid shape that forms air bubbles between the grated wasabi, minimizing unnecessary spiciness
  • We are proud to call SamehadaDB because the grid produces a pleasant taste and aroma with a nice harmony
  • (Texts in this section is contributed by ujihisa. Thanks ujihisa)

Motivation

  • RDBMS implementation which is compact but has almost of essential componets and features implemted in practical techniques is needed for newbie of database system developer
  • SamehadaDB is developed to be a RDBMS like above
    • Golang has high readability
    • SamehadaDB is based on BusTub RDBMS which is used in educational purpose

Current Status

  • SamehadaDB can be used as simple embedded DB (library form)
  • And can be used as DB server which offers REST API I/F
    • Please see server directory and this note
    • The server listen on http://0.0.0.0:19999/Query (this means localhost, 127.0.0.1, other IP addresses your machine has)
      • Content-Type of request and response are "application/json"
    • Additionaly the server listen on http://0.0.0.0:19999/QueryMsgPack
      • Content-Type of request is "application/json" but one of response is "application/octet-stream"
      • Response is serialized binary in MessagePack specification
      • For understanding response data schema, easy way is reading demo client code (schema is almost same with JSON response)
    • There are Win binary and Linux binary at Release page
      • if Linux one runs without error at lauch, you are lucky :)
      • when error occurs, you need to build by myself :)
        • please check that value of EnableOnMemStorage constant is false and modify it if needed (/lib/common/config.go)
    • there is simple client which can be useed on browser in demo-client dir
      • very simple SPA. requirement is web browser which can executes JS only :)
        • It can access SamehadaDB server on your machine (localhost, 127.0.0.1, etc...)
      • It is also pleced at here for demonstration
        • ryogrid.net:19999 is public SamehadaDBd server for the demonstration
        • Do not insert private or secret information/data! the server does not do isolation
  • ATTENTION: SamehadaDB is not developed for productional use! There are no warranties!
  • By the way, procedure described on next section executes almost all defined unit tests except these of taking long time

Procedure of Executing SamehadaDB (executing unit tests)

  • Please install golang environment with package system your OS has (apt, yum, brew ...etc)
    • If you use Windows, you can select both Windows native environment and WSL Ubuntu environment
  • If you select Windows native environments or golang environment which is installed with package system can't execute SamehadaDB, you should install official binary directly
  • Executing all unit tests which test several features and components of SamehadaDB except several tests taking long time

Roadmap

  • Predicates on Seq Scan
  • Multiple Item on Predicate: AND, OR
  • Predicates: <, >, <=, >=
  • Null 1
  • Predicates: NOT 1
  • Inline types (integer, varchar, float, boolean, bigint, smallint, decimal, timestamp, datetime and etc)
  • Delete Tuple
  • Update Tuple
  • LIMIT / OFFSET 1
  • Varchar
  • Persistent Catalog
  • Updating of Table Schema
  • LRU replacer
  • Latches
  • Transactions
  • Rollback When Abort Occurs
  • Logging
  • Checkpointing
    • Simple Checkpointing (all transactions are blocked until finish of checkpointing)
    • Fuzzy Checkpointing (ARIES)
  • Recovery from Logs
  • Index
    • Hash Index 1
      • Hash index can be used only equal(=) operator is specified to index having columns
      • Thread safe but serialized (not supported concurrent access)
    • SkipList Index
      • Concurrent access is supported
    • B-tree Index 1
    • Logging And Recovery Of Index Data
  • JOIN
    • INNER JOIN (Hash Join, Index Join, Nested Loop Join)
      • Condition specified at ON clause should be composed of single item and can use equal(==) operator only
    • OUTER JOIN
    • CROSS JOIN
  • Aggregations (COUNT, MAX, MIN, SUM on SELECT clause including Group by and Having) 1
  • Sort (ORDER BY clause) 1
  • Concurrent Execution of Transactions
    • Concurrecy control protcol is Strong Strict 2-Phase Locking (SS2PL) and locking granularity is tuple level (record level)
    • Avoidance of phantom problem is not implemented yet
      • So, current transaction isolation level is "REPEATABLE READ"
    • And when table has Indexes, dirty read can be occured
      • So, when you use Indexes, isolation level is "DIRTY READ"
        • Why dirty read occurs is lack of consistency between table record and index entry at DELETE
  • Execution Planning from hard coded SQL like method call I/F (like some kind of embedded DB)
  • Execution of Query with SQL string
    • not supported multi queries on a request yet
  • Frontend Impl as Embedded DB Library (like SQLite)
  • Deduplication of Result Records (Distinct)
  • Query Optimization (Selinger)
    • cases below are not supported now
      • predicate including OR operation, NOT, IS NULL
      • projection including aggregation
      • LIMIT, ORDER BY
  • Statistics Data for Optimizer
    • Statistics data are updated continuously with full scan...
  • TRANSACTION Statement on SQL
    • This includes adding support of multi statements (multi statements is not suported on SQL now)
  • AS clause
  • Nested Query
  • Predicates: IN
  • DB Connector (Driver) or Other Kind of Network Access Interface
    • MySQL or PostgreSQL Compatible Protocol
    • REST
  • Deallocate and Reuse Page
  • Optimization of INSERT
    • Current implementation searches a free space which is enough for insert data with sequential scan from head (this is slow on exsisting large amount of records situation)
  • UNION clause
  • Materialization
    • Classes which offers functionality for materialization exists
    • Now, HashJoinExecutor only do materialization with the classes
  • Authentication
  • Making Usable from OR Mapper of One Web Framework Such as Django (Python) on Simple Application Scope
    • implementation of DB driver/connector for Python is needed (Or supporting major RDBMS compatible NW I/F)

Specification (Limitation)

  • Supported Data Types
    • Integer (int32)
    • Float (float32)
    • Varchar (variable length)
      • Max length is a little less than 4KB now
    • Boolean (bool/1byte)
  • to be wrote...

More Info

About Skip List Index SamehadaDB has

Related Project

Advisor

Acknowledgement

  • 2022 May - present: SamehadaDB project is assisted by OSS community support program of JetBrains s.r.o with offering development tools such as GoLand IDE

Past work

Footnotes

  1. Use on frontend is not supported now 2 3 4 5 6 7