Kotlin Multiplatform (KMP) library for string unicode normalization

Doist/doistx-normalize

doistx-normalize

Kotlin Multiplatform (KMP) library that adds support for normalization as described by Unicode Standard Annex #15 - Unicode Normalization Forms, by extending the String class with a normalize(Form) method.

All normalization forms are supported:

  • Form.NFC: Normalization Form C, canonical decomposition followed by canonical composition.
  • Form.NFD: Normalization Form D, canonical decomposition.
  • Form.NFKC: Normalization Form KC, compatibility decomposition followed by canonical composition.
  • Form.NFKD: Normalization Form KD, compatibility decomposition.

Usage

"Äffin".normalize(Form.NFC) // => "Äffin"
"Äffin".normalize(Form.NFD) // => "A\u0308ffin"
"Äffin".normalize(Form.NFKC) // => "Äffin"
"Äffin".normalize(Form.NFKD) // => "A\u0308ffin"

"Henry \u2163".normalize(Form.NFC) // => "Henry \u2163"
"Henry \u2163".normalize(Form.NFD) // => "Henry \u2163"
"Henry \u2163".normalize(Form.NFKC) // => "Henry IV"
"Henry \u2163".normalize(Form.NFKD) // => "Henry IV"
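
A common reason to normalize is comparing strings that render identically but are encoded with different code points. The following is a minimal, JVM-only sketch that uses the JDK's java.text.Normalizer as a stand-in, so it runs without this library; equalsNormalized is a hypothetical helper name, not part of doistx-normalize. In common code, the same idea would be expressed by calling normalize(Form.NFC) on both sides before comparing.

```kotlin
import java.text.Normalizer

// Hypothetical helper: compares two strings after NFC normalization.
// Uses the JDK's Normalizer (JVM only), not doistx-normalize itself.
fun equalsNormalized(a: String, b: String): Boolean =
    Normalizer.normalize(a, Normalizer.Form.NFC) ==
        Normalizer.normalize(b, Normalizer.Form.NFC)

fun main() {
    val precomposed = "\u00C4ffin" // "Ä" as a single code point
    val decomposed = "A\u0308ffin" // "A" followed by a combining diaeresis

    println(precomposed == decomposed)                 // false: different code points
    println(equalsNormalized(precomposed, decomposed)) // true: identical after NFC
}
```

Normalizing both operands to the same form, rather than only one, keeps the comparison correct regardless of which form each input arrived in.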

Setup

repositories {
   mavenCentral()
}

kotlin {
   sourceSets {
      val commonMain by getting {
         dependencies {
            implementation("com.doist.x:normalize:1.1.1")
         }
      }
   }
}

Development

Building this project can be tricky, as cross-compilation in KMP is not widely supported. In this case:

  • macOS and iOS targets must be built on macOS.
  • Windows targets should be built on Windows (or a JDK under Wine).
  • Linux targets must be built on Linux, because they depend on libunistring.
  • JVM/Android and JS targets can be cross-compiled.

The defaults can be adjusted using two project properties:

  • targets is a string that selects which targets to build, test, or publish, depending on the task that runs.
    • all (default): All possible targets in the current host.
    • native: Native targets only (e.g., on macOS, that's macOS, iOS, watchOS and tvOS).
    • common: Common targets only (e.g., JVM, JS, Wasm).
    • host: Host OS only.
  • publishRootTarget is a boolean that indicates whether the kotlinMultiplatform root publication is included when publishing enabled targets (can only be done once).
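
Gradle project properties are usually passed on the command line with -P (for example, -Ptargets=native). As an illustration of the accepted values only, here is a minimal sketch that parses the two properties into typed values. The names Targets, BuildOptions and parseBuildOptions are hypothetical, this is not the project's actual build logic, and the false default for publishRootTarget is an assumption, since its default is not stated here.

```kotlin
// Hypothetical typed view of the two project properties described above.
enum class Targets { ALL, NATIVE, COMMON, HOST }

data class BuildOptions(val targets: Targets, val publishRootTarget: Boolean)

// Parses raw property values (e.g. collected from -P flags) into BuildOptions.
// Assumption: publishRootTarget defaults to false when absent.
fun parseBuildOptions(props: Map<String, String>): BuildOptions {
    val targets = when (val raw = props["targets"] ?: "all") { // "all" is the documented default
        "all" -> Targets.ALL
        "native" -> Targets.NATIVE
        "common" -> Targets.COMMON
        "host" -> Targets.HOST
        else -> throw IllegalArgumentException("Unknown targets value: $raw")
    }
    val publishRootTarget = props["publishRootTarget"]?.toBoolean() ?: false
    return BuildOptions(targets, publishRootTarget)
}

fun main() {
    // prints BuildOptions(targets=NATIVE, publishRootTarget=false)
    println(parseBuildOptions(mapOf("targets" to "native")))
}
```

Rejecting unknown values instead of silently falling back to the default makes a mistyped property fail fast.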

When targets are built, tested and published in CI/CD, the Apple host handles Apple-specific targets, the Windows host handles Windows, and Linux handles everything else.

Release

To release a new version, ensure CHANGELOG.md is up-to-date, and push the corresponding tag (e.g., v1.2.3). GitHub Actions handles the rest.

License

Released under the MIT License.

Unicode's normalization test suite is subject to this license.
