SakuraKit is a Swift SDK for quickly prototyping speech-to-speech and text-to-speech features on top of different APIs, so you can build low-latency, multimodal experiences with ease.
The SDK is named after the cherry blossoms (sakura) I hope to enjoy in Shibuya next year. 🌸
To get started with SakuraKit, add it to your Swift project using Swift Package Manager (SPM):
dependencies: [
    .package(url: "https://github.com/rryam/SakuraKit", from: "0.1.0")
]
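For context, here is a minimal sketch of how that dependency entry might sit inside a complete `Package.swift`. The package and target name `MyApp`, the tools version, and the assumption that the library product is also called `SakuraKit` are illustrative and not taken from the SakuraKit repository; the package may also require a `platforms:` entry that this section does not specify.

```swift
// swift-tools-version:5.9
import PackageDescription

// Hypothetical manifest: "MyApp" is a placeholder name for your own package.
let package = Package(
    name: "MyApp",
    dependencies: [
        // Dependency entry as shown above.
        .package(url: "https://github.com/rryam/SakuraKit", from: "0.1.0")
    ],
    targets: [
        .executableTarget(
            name: "MyApp",
            dependencies: [
                // Assumes the library product is named "SakuraKit".
                .product(name: "SakuraKit", package: "SakuraKit")
            ]
        )
    ]
)
```

If you add the package through Xcode instead, the same repository URL goes into the package search field and no manifest edits are needed.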
Then, import it into your project:
import SakuraKit
- Play.ht API Key and User ID: Required for text-to-speech functionality.
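Since the API key and user ID are secrets, one option is to read them from environment variables rather than hard-coding them. The sketch below is not part of SakuraKit: the `PlayHTCredentials` type, the `loadPlayHTCredentials` helper, and the variable names `PLAYHT_API_KEY` and `PLAYHT_USER_ID` are all hypothetical choices made for illustration.

```swift
import Foundation

// Hypothetical container for the two values Play.ht requires.
struct PlayHTCredentials {
    let apiKey: String
    let userId: String
}

// Hypothetical error reporting which variable was missing or empty.
enum CredentialError: Error, Equatable {
    case missing(String)
}

// Reads both credentials from the given environment dictionary.
// Defaults to the process environment; pass a dictionary to test it.
func loadPlayHTCredentials(
    from environment: [String: String] = ProcessInfo.processInfo.environment
) throws -> PlayHTCredentials {
    // Returns the non-empty value for `name`, or throws if it is absent.
    func value(for name: String) throws -> String {
        guard let raw = environment[name], !raw.isEmpty else {
            throw CredentialError.missing(name)
        }
        return raw
    }

    // Assumed variable names; adjust them to match your own setup.
    let apiKey = try value(for: "PLAYHT_API_KEY")
    let userId = try value(for: "PLAYHT_USER_ID")
    return PlayHTCredentials(apiKey: apiKey, userId: userId)
}
```

The loaded values can then be passed to `PlayAI(apiKey:userId:)` in place of string literals.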
Here is a quick example that initializes the Play.ht client and creates a PlayNote:
import SakuraKit

// Initialize the Play.ht client
let playAI = PlayAI(apiKey: "your_playht_api_key", userId: "your_user_id")

// Create a PlayNote request for generating audio from a PDF.
// `sourceURL` points to the source PDF and must be defined beforehand.
let request = PlayNoteRequest(
    sourceFileUrl: sourceURL,
    synthesisStyle: .podcast,
    voice1: .angelo,
    voice2: .nia
)

// Submit the request; call this from an async context that can throw.
let response = try await playAI.createPlayNote(request)
Available synthesis styles include:
- Podcast conversations
- Executive briefings
- Children's stories
- Debates
I welcome contributions! Feel free to open issues or submit pull requests to help improve SakuraKit.
SakuraKit is licensed under the MIT License. See LICENSE for more details.