    Namespace Sdl.LanguagePlatform.Core.Tokenization

    Classes

    AutoLocalizationSettings

    Contains specialized settings for auto-localization of tokens.

    CurrencyFormat

    Defines a currency symbol (e.g. $, £, USD) along with the permissible options for its position and separator.

    CustomUnitDefinition

    Provides additional metadata for a custom unit when creating a recognizer for Measurement.

    DateTimeToken

    A Token which represents a date or time expression.

    GenericPlaceableToken

    Represents a generic, abstract token, which is a sequence of characters in the input. A token is identified using a tokenizer, which breaks up the sequence of characters in the input into a sequence of tokens. That token sequence is non-overlapping, but not necessarily contiguous.

    Match

    A match object returned by a finite-state transducer (FST), finite-state automaton (FSA), or regular-expression match.

    MeasureToken

    A Token which represents a measurement, which consists of a numeric value and a unit.

    NumberToken

    A Token which represents a numeric value.

    PrioritizedToken

    A Token with an assigned priority, usually originating from a recognizer's priority. This class is for internal purposes only and should not be used in third-party applications.

    SimpleToken

    Represents a generic, abstract token, which is a sequence of characters in the input. A token is identified using a tokenizer, which breaks up the sequence of characters in the input into a sequence of tokens. That token sequence is non-overlapping, but not necessarily contiguous.

    TagToken

    Represents a generic, abstract token, which is a sequence of characters in the input. A token is identified using a tokenizer, which breaks up the sequence of characters in the input into a sequence of tokens. That token sequence is non-overlapping, but not necessarily contiguous.

    Token

    Represents a generic, abstract token, which is a sequence of characters in the input. A token is identified using a tokenizer, which breaks up the sequence of characters in the input into a sequence of tokens. That token sequence is non-overlapping, but not necessarily contiguous.

    TokenBundle

    A special Token which represents a set of alternatives (i.e. an ambiguous analysis) of other tokens that cover exactly the same input span.

    TokenizationContext

    Holds additional tokenization metadata for a given culture, such as custom formats for numbers and dates.
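
    The Token summary above describes a token sequence as non-overlapping but not necessarily contiguous. The following is a minimal sketch of that property only. It assumes a plain whitespace splitter and uses hypothetical names (`TokenSpanSketch`, `Split`) that are not part of Sdl.LanguagePlatform.Core.Tokenization; the library's own tokenizer and Token members may behave differently.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration; not part of Sdl.LanguagePlatform.Core.Tokenization.
public static class TokenSpanSketch
{
    // Returns one (Start, Length, Text) tuple per run of non-whitespace characters.
    // Whitespace is skipped, so consecutive spans never overlap but may leave gaps.
    public static List<(int Start, int Length, string Text)> Split(string input)
    {
        var spans = new List<(int Start, int Length, string Text)>();
        int i = 0;
        while (i < input.Length)
        {
            if (char.IsWhiteSpace(input[i])) { i++; continue; }
            int start = i;
            while (i < input.Length && !char.IsWhiteSpace(input[i])) i++;
            spans.Add((start, i - start, input.Substring(start, i - start)));
        }
        return spans;
    }

    public static void Main()
    {
        // Prints each span as a half-open character range followed by its text.
        foreach (var s in Split("5 kg  net"))
            Console.WriteLine($"[{s.Start}, {s.Start + s.Length}) \"{s.Text}\"");
    }
}
```

    For the input "5 kg  net" the spans are [0, 1), [2, 4) and [6, 9): none overlap, and the whitespace at positions 1, 4 and 5 is covered by no span.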

    Interfaces

    ILocalizableToken

    Defines the interface for auto-localizable tokens. Localizable tokens have a value, and their surface representation ("text") can be automatically converted into a target culture representation, given the token's value and the target culture.
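
    The idea described for ILocalizableToken, re-rendering a token's text from its value and a target culture, can be sketched with the standard System.Globalization classes. This is an illustration under assumptions: `ILocalizableSketch`, `NumberSketch` and `Localize` are hypothetical names, and the actual members of ILocalizableToken are not listed on this page and may differ.

```csharp
using System;
using System.Globalization;

// Hypothetical interface for illustration; the real ILocalizableToken may differ.
public interface ILocalizableSketch
{
    string Localize(CultureInfo target);
}

public sealed class NumberSketch : ILocalizableSketch
{
    private readonly double _value;

    public NumberSketch(double value) => _value = value;

    // Re-renders the stored value using the target culture's group and decimal separators.
    public string Localize(CultureInfo target) => _value.ToString("N1", target);
}

public static class LocalizationSketch
{
    public static void Main()
    {
        ILocalizableSketch token = new NumberSketch(1234.5);
        Console.WriteLine(token.Localize(CultureInfo.GetCultureInfo("en-US")));
        Console.WriteLine(token.Localize(CultureInfo.GetCultureInfo("de-DE")));
    }
}
```

    On a .NET installation with full culture data, the two lines differ only in their separators (typically 1,234.5 for en-US and 1.234,5 for de-DE), while the underlying value stays the same.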

    Enums

    BuiltinRecognizers

    Enumerates the known types of special token recognizers.

    CurrencySymbolPosition

    Defines the permissible positions for a currency symbol relative to the currency amount.

    DateTimePatternType

    Enumerates the different types of a date or time pattern.

    LocalizationParametersSource

    Controls which tokens are used to obtain detailed localization parameters, such as the numeric group separator override, or whitespace handling between a number and the unit in measurements.

    NumericSeparator

    The types of numeric separator that can occur in a number token.

    Sign

    The sign of a number.

    TokenType

    The type of a token, e.g. whether the token represents a word, punctuation, etc.

    TokenizerFlags

    Flags controlling tokenizer behaviour.

    Unit

    Enumerates the units known by the system. Only units that may require cross-system conversion are listed; that conversion is not yet implemented.

    UnitSeparationMode

    Controls how units are separated from the numeric value in measurements.

    Generated by DocFX