    Class Segment

    Represents a segment, which is a sequence of SegmentElements, in a particular language.

    Inheritance
    object
    Segment
    Inherited Members
    object.Equals(object)
    object.Equals(object, object)
    object.ReferenceEquals(object, object)
    object.GetType()
    object.MemberwiseClone()
    Namespace: Sdl.LanguagePlatform.Core
    Assembly: Sdl.LanguagePlatform.Core.dll
    Syntax
    [DataContract]
    public class Segment

    Constructors

    Segment()

    Initializes a new instance with the InvariantCulture, and an empty list of elements.

    Declaration
    public Segment()

    Segment(CultureCode)

    Initializes a new instance with the specified culture, and an empty list of elements.

    Declaration
    public Segment(CultureCode culture)
    Parameters
    Type Name Description
    CultureCode culture

    The CultureCode object representing the language.

    Properties

    Culture

    Gets or sets the culture for this segment.

    Declaration
    public CultureCode Culture { get; set; }
    Property Value
    Type Description
    CultureCode

    CultureName

    Gets or sets the culture name for this segment. The culture name must be resolvable through CultureInfoExtensions.GetCultureInfo(string), or an exception will be thrown.

    Declaration
    [DataMember]
    public string CultureName { get; set; }
    Property Value
    Type Description
    string

    Elements

    Gets or sets the collection of elements in this segment.

    Declaration
    [DataMember]
    public List<SegmentElement> Elements { get; set; }
    Property Value
    Type Description
    List<SegmentElement>

    HasPairedTags

    Gets a value which indicates whether this segment contains any paired tags. Only start tags are checked; the tag structure is assumed to be valid.

    Declaration
    public bool HasPairedTags { get; }
    Property Value
    Type Description
    bool

    HasPlaceables

    Gets a value which indicates whether this segment contains any placeables. Note that the return value is only valid if the segment is tokenized.

    Declaration
    public bool HasPlaceables { get; }
    Property Value
    Type Description
    bool

    HasTags

    Gets a value which indicates whether this segment contains any tags.

    Declaration
    public bool HasTags { get; }
    Property Value
    Type Description
    bool

    IsEmpty

    Gets a value indicating whether this segment is empty: true if it contains no elements, false otherwise.

    Declaration
    public bool IsEmpty { get; }
    Property Value
    Type Description
    bool

    LastElement

    Gets or sets the last element of this segment.

    Declaration
    public SegmentElement LastElement { get; set; }
    Property Value
    Type Description
    SegmentElement

    Tokens

    Gets or sets the collection of tokens in this segment.

    Declaration
    [DataMember]
    public List<Token> Tokens { get; set; }
    Property Value
    Type Description
    List<Token>

    Methods

    Add(SegmentElement)

    Adds the provided segment element to the segment's list of elements. If the added element is a text element and the last segment element is also a text element, the two are merged.

    Declaration
    public void Add(SegmentElement element)
    Parameters
    Type Name Description
    SegmentElement element

    The element to append

    Add(string)

    Adds the provided string as a new text element to the segment's list of elements. If the last segment element is a Text element as well, they will be merged.

    Declaration
    public void Add(string text)
    Parameters
    Type Name Description
    string text

    The text to append
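
    As a rough illustration of Add(string) and the merge rule described above, the following sketch builds a text-only segment. It uses only members documented on this page and assumes the project references Sdl.LanguagePlatform.Core.dll. The class name SegmentBuildExample is made up for the example, and the results noted in the comments are what the descriptions on this page imply, not verified output.

```csharp
using System;
using Sdl.LanguagePlatform.Core;

// Hypothetical example class; not part of the SDL API.
public static class SegmentBuildExample
{
    // Builds a segment from two consecutive text fragments.
    public static Segment Build()
    {
        var segment = new Segment();   // InvariantCulture, empty element list
        segment.Add("Hello, ");        // becomes the first text element
        segment.Add("world");          // documented to merge into the preceding text element
        return segment;
    }

    public static void Main()
    {
        Segment segment = Build();
        Console.WriteLine(segment.IsEmpty);          // whether the segment has no elements
        Console.WriteLine(segment.Elements.Count);   // element count after merging
        Console.WriteLine(segment.ToPlain());        // plain text of the whole segment
    }
}
```

    Because adjacent text is merged on insertion, the number of elements cannot be used to count how many times Add was called.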

    AddRange(IEnumerable<SegmentElement>)

    Adds all segment elements in the collection to this segment.

    Declaration
    public void AddRange(IEnumerable<SegmentElement> elements)
    Parameters
    Type Name Description
    IEnumerable<SegmentElement> elements

    The elements to add

    AnchorDanglingTags()

    Sets the anchor for any tags which are not yet anchored (including standalone/placeholder tags). Does not modify tag IDs or alignment anchors.

    Declaration
    public void AnchorDanglingTags()

    Clear()

    Empties the list of segment elements.

    Declaration
    public void Clear()

    ComputeStrictIdentityStringAsync()

    Computes a strict identity string for this segment. Use with GetStrictHash().

    Declaration
    public Task<string> ComputeStrictIdentityStringAsync()
    Returns
    Type Description
    Task<string>

    ComputeStrictIdentityStringAsync(IEnumerable<Token>)

    Generates a strict identity string from the provided tokens (not intended for fuzzy matching).

    Declaration
    public static Task<string> ComputeStrictIdentityStringAsync(IEnumerable<Token> tokens)
    Parameters
    Type Name Description
    IEnumerable<Token> tokens
    Returns
    Type Description
    Task<string>

    DeleteEmptyTagPairs(bool)

    Deletes empty tag pairs (a start tag directly followed by the end tag with the same tag anchor) from the segment.

    Declaration
    public bool DeleteEmptyTagPairs(bool onlyInPeripheralPositions)
    Parameters
    Type Name Description
    bool onlyInPeripheralPositions

    If true, will delete empty tag pairs only if they appear in peripheral positions (leading, trailing).

    Returns
    Type Description
    bool

    true if any tags were deleted, and false otherwise.

    DeleteTags()

    Removes all tags from the segment, applying the DeleteAll tag deletion mode.

    Declaration
    public bool DeleteTags()
    Returns
    Type Description
    bool

    true if any tags were deleted, and false otherwise.

    DeleteTags(DeleteTagsAction)

    Removes all tags from the segment, applying the specified tag deletion mode.

    Declaration
    public bool DeleteTags(Segment.DeleteTagsAction mode)
    Parameters
    Type Name Description
    Segment.DeleteTagsAction mode

    The tag deletion mode

    Returns
    Type Description
    bool

    true if any tags were deleted, and false otherwise.
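
    DeleteTags modifies the segment in place, so callers that still need the tagged original may want to work on a copy. The sketch below shows one way to do that with Duplicate() and the parameterless DeleteTags() overload. The class and method names are hypothetical, and the example assumes a reference to Sdl.LanguagePlatform.Core.dll.

```csharp
using System;
using Sdl.LanguagePlatform.Core;

// Hypothetical example class; not part of the SDL API.
public static class TagStrippingExample
{
    // Returns a tag-free deep copy and leaves the source segment untouched.
    public static Segment WithoutTags(Segment source)
    {
        Segment copy = source.Duplicate();
        bool removed = copy.DeleteTags();   // true if any tags were deleted
        Console.WriteLine(removed ? "Tags removed." : "No tags found.");
        return copy;
    }

    public static void Main()
    {
        var segment = new Segment();
        segment.Add("Plain text only");

        Segment stripped = WithoutTags(segment);
        Console.WriteLine(stripped.HasTags);
    }
}
```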

    Duplicate()

    Creates a new instance that is a deep copy of this instance.

    Declaration
    public Segment Duplicate()
    Returns
    Type Description
    Segment

    A new instance that is a deep copy of this instance.

    Equals(Segment)

    Compares this instance to another Segment object.

    Declaration
    public bool Equals(Segment other)
    Parameters
    Type Name Description
    Segment other

    The other instance.

    Returns
    Type Description
    bool

    true if the language and all the elements are the same, otherwise false.
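
    The following sketch shows how Duplicate() and Equals(Segment) might be combined to confirm that a deep copy is independent of its source. It relies only on members documented on this page and assumes a reference to Sdl.LanguagePlatform.Core.dll. The class name is hypothetical, and the comments describe the behaviour this page documents, not verified output.

```csharp
using System;
using Sdl.LanguagePlatform.Core;

// Hypothetical example class; not part of the SDL API.
public static class SegmentCopyExample
{
    public static void Main()
    {
        var original = new Segment();
        original.Add("Save the file.");

        // A fresh deep copy has the same language and the same elements.
        Segment copy = original.Duplicate();
        Console.WriteLine(original.Equals(copy));

        // Changing the copy should not affect the original, since the copy is deep.
        copy.Add(" Then close it.");
        Console.WriteLine(original.Equals(copy));
        Console.WriteLine(original.ToPlain());
    }
}
```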

    FillUnmatchedStartAndEndTags()

    Inserts the corresponding start and end tags for unmatched end and start tags into the segment. For unmatched end tags, the corresponding start tags are inserted at the beginning of the segment; for unmatched start tags, the corresponding end tags are appended at the end. In certain cases, not all dangling tags can be filled. To obtain a valid segment without any unmatched tags, call RemoveUnmatchedStartAndEndTags(bool) after calling this method. Note that only the tag type is checked, not whether there are start or end tags without a corresponding tag having the same tag anchor.

    The method stops if the tag pairing structure is incorrect (i.e. if there are overlapping tags).

    Declaration
    public bool FillUnmatchedStartAndEndTags()
    Returns
    Type Description
    bool

    true if the segment was modified, and false otherwise. Note that after calling this method, there may still be unmatched start or end tags in the segment.
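
    The fill-then-remove sequence recommended above could be wrapped in a small helper along these lines. This is a sketch, not the library's own repair routine: the helper name RepairUnmatchedTags is hypothetical, and it uses the parameterless RemoveUnmatchedStartAndEndTags() overload on the assumption that leftover dangling tags should be removed in any position, not only peripheral ones. A reference to Sdl.LanguagePlatform.Core.dll is assumed.

```csharp
using System;
using Sdl.LanguagePlatform.Core;

// Hypothetical example class; not part of the SDL API.
public static class TagRepairExample
{
    // Returns true if the segment was modified.
    public static bool RepairUnmatchedTags(Segment segment)
    {
        if (!segment.HasUnmatchedStartOrEndTags())
            return false;   // nothing to repair

        // First try to complete dangling tags with their counterparts.
        bool modified = segment.FillUnmatchedStartAndEndTags();

        // Filling may leave some dangling tags behind; remove whatever is left.
        if (segment.HasUnmatchedStartOrEndTags())
            modified |= segment.RemoveUnmatchedStartAndEndTags();

        return modified;
    }

    public static void Main()
    {
        var segment = new Segment();
        segment.Add("No tags here");
        Console.WriteLine(RepairUnmatchedTags(segment));
    }
}
```

    Pass true to RemoveUnmatchedStartAndEndTags(bool) instead if only peripheral dangling tags should be dropped.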

    FindTag(TagType, int)

    Finds and returns the tag with the provided type and the provided tag anchor, or null if no such tag exists in the segment.

    Declaration
    public Tag FindTag(TagType type, int anchor)
    Parameters
    Type Name Description
    TagType type
    int anchor
    Returns
    Type Description
    Tag

    GetHashCode()

    Returns a hash code for this object.

    Declaration
    public override int GetHashCode()
    Returns
    Type Description
    int

    A hash code for this object

    Overrides
    object.GetHashCode()

    GetMaxTagAnchor()

    Returns the highest tag anchor used in the segment, or 0 if no tags are present.

    Declaration
    public int GetMaxTagAnchor()
    Returns
    Type Description
    int

    GetMinMaxTagAnchor(out int, out int)

    Returns the smallest and largest tag anchor used in the segment. Both default to 0.

    Declaration
    public void GetMinMaxTagAnchor(out int min, out int max)
    Parameters
    Type Name Description
    int min
    int max

    GetTagCount()

    Returns the number of tags in the segment. Paired tags are counted only once.

    Declaration
    public int GetTagCount()
    Returns
    Type Description
    int

    The segment's tag count
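
    The three tag-statistics methods above can be queried together, for example when logging. The sketch below is illustrative only: the class and method names are hypothetical, and it assumes a reference to Sdl.LanguagePlatform.Core.dll. Note the out parameters on GetMinMaxTagAnchor.

```csharp
using System;
using Sdl.LanguagePlatform.Core;

// Hypothetical example class; not part of the SDL API.
public static class TagStatisticsExample
{
    // Formats the tag count and anchor range of a segment for diagnostics.
    public static string Describe(Segment segment)
    {
        int min, max;
        segment.GetMinMaxTagAnchor(out min, out max);   // both documented to default to 0

        return "tags: " + segment.GetTagCount()         // paired tags are counted once
             + ", anchors: " + min + ".." + max
             + ", highest: " + segment.GetMaxTagAnchor();
    }

    public static void Main()
    {
        var segment = new Segment();
        segment.Add("Text without tags");
        Console.WriteLine(Describe(segment));
    }
}
```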

    GetTagIdGroups()

    Computes a mapping from the start tag token index to that tag's tag ID. Only start and standalone/placeholder tags are included in the mapping. The mapping may be n:1. The segment must be tokenized, or an exception is thrown.

    Declaration
    public Dictionary<int, string> GetTagIdGroups()
    Returns
    Type Description
    Dictionary<int, string>

    GetTagPairings()

    Returns a dictionary of paired tag token indices, mapping from the start tag's token index to the end tag's token index. The segment must be tokenized, or an exception is thrown.

    Declaration
    public Dictionary<int, int> GetTagPairings()
    Returns
    Type Description
    Dictionary<int, int>

    GetTokenIndex(SegmentPosition)

    Returns the index of the token at the specified position.

    Declaration
    public int GetTokenIndex(SegmentPosition p)
    Parameters
    Type Name Description
    SegmentPosition p
    Returns
    Type Description
    int

    The index of the token at the specified position, or -1 if it is not found, or if the segment is not tokenized.

    GetWeakHashCode()

    Returns a hash code which does not depend on tag anchors in the segment. This can be used for translation tracking in bilingual documents.

    Declaration
    public int GetWeakHashCode()
    Returns
    Type Description
    int

    A hash code which is independent of tag anchors.

    HasPeripheralWhitespace()

    Determines whether the segment starts or ends with at least one whitespace character.

    Declaration
    public bool HasPeripheralWhitespace()
    Returns
    Type Description
    bool

    HasTokenBundles()

    Returns true if any of the segment's tokens is a TokenBundle (i.e. an ambiguous tokenization), and false otherwise. Token bundles should only be used inside the TM Kernel and not be returned through the TM API.

    Declaration
    public bool HasTokenBundles()
    Returns
    Type Description
    bool

    HasUnmatchedStartOrEndTags()

    Determines whether the segment has any unmatched start or end tags. Note that this method only tests the tag type, and does not handle paired tags where the start or end tag is missing.

    Declaration
    public bool HasUnmatchedStartOrEndTags()
    Returns
    Type Description
    bool

    true if the segment contains any unmatched start or end tags, and false otherwise.

    IsValid()

    Determines if this segment is valid.

    Declaration
    public bool IsValid()
    Returns
    Type Description
    bool

    true if the segment is valid, false otherwise.

    MergeAdjacentTextRuns()

    Merges adjacent text runs.

    Declaration
    public void MergeAdjacentTextRuns()

    RemoveTokenBundles()

    Replaces token bundles with the "best" token in that bundle. Returns true if any replacement has been done, and false otherwise.

    Declaration
    public bool RemoveTokenBundles()
    Returns
    Type Description
    bool

    RemoveUnmatchedStartAndEndTags()

    Deletes all tags from the segment which have a tag type of Core.TagType.UnmatchedStart or Core.TagType.UnmatchedEnd. Note that this method only tests the tag type, and does not handle paired tags where the start or end tag is missing.

    Declaration
    public bool RemoveUnmatchedStartAndEndTags()
    Returns
    Type Description
    bool

    true if the segment was modified, and false otherwise.

    RemoveUnmatchedStartAndEndTags(bool)

    Deletes all tags from the segment which have a tag type of Core.TagType.UnmatchedStart or Core.TagType.UnmatchedEnd, if these tags occur in peripheral positions. That is, dangling end tags are removed only if they appear at the start of the segment with no other tags or text preceding them, and dangling start tags are removed only if they appear at the end of the segment with no other tags or text following them.

    Note that this method only tests the tag type, and does not handle paired tags where the start or end tag is missing.

    Declaration
    public bool RemoveUnmatchedStartAndEndTags(bool peripheralPositionsOnly)
    Parameters
    Type Name Description
    bool peripheralPositionsOnly
    Returns
    Type Description
    bool

    true if the segment was modified, and false otherwise.

    RenumberTagAnchors(int, ref int)

    Renumbers tag anchors, starting at nextTagAnchor, in a consecutive manner. Although tag anchors have no semantics for standalone tags, they are also anchored in the same manner. Errors in tag numbering will be ignored (but preserved, i.e. invalid tag anchors will be mapped to potentially new, also invalid tag anchors).

    Declaration
    public bool RenumberTagAnchors(int nextTagAnchor, ref int maxAlignmentAnchor)
    Parameters
    Type Name Description
    int nextTagAnchor

    The first anchor to assign (must be larger than zero)

    int maxAlignmentAnchor

    Returns the highest alignment anchor in the renumbered segment.

    Returns
    Type Description
    bool

    true if any anchors were reassigned, and false otherwise.

    RenumberTagAnchors(ref int)

    Renumbers tag anchors so that they start at 1 and are consecutive. Although tag anchors have no semantics for standalone tags, they are also anchored in the same manner. Errors in tag numbering will be ignored (but preserved, i.e. invalid tag anchors will be mapped to potentially new, also invalid tag anchors).

    Declaration
    public bool RenumberTagAnchors(ref int maxAlignmentAnchor)
    Parameters
    Type Name Description
    int maxAlignmentAnchor
    Returns
    Type Description
    bool

    true if any anchors were reassigned, and false otherwise.

    ToPlain()

    Returns a string containing only the plain text in this segment. Note that text placeholders will be replaced with their text equivalent.

    Declaration
    public string ToPlain()
    Returns
    Type Description
    string

    A string containing only the plain text in this segment.

    ToPlain(SegmentRange)

    Computes the plain-text version of the part of the segment specified by the provided range.

    Declaration
    public string ToPlain(SegmentRange range)
    Parameters
    Type Name Description
    SegmentRange range

    The range of the segment to convert

    Returns
    Type Description
    string

    The plain-text string corresponding to the provided range.

    ToPlain(bool, bool, out List<SegmentPosition>)

    Computes the plain-text version of the segment and returns, in the ranges list, the segment range of each character of the result string. The number of elements in that collection will be equal to the length of the string in characters.

    Declaration
    public string ToPlain(bool tolower, bool tobase, out List<SegmentPosition> ranges)
    Parameters
    Type Name Description
    bool tolower

    If true, the returned string will be lower-cased

    bool tobase

    If true, all letters will be mapped to their base character (i.e. diacritics will be stripped)

    List<SegmentPosition> ranges

    A reference to the list of segment ranges which will be returned upon completion. The list includes, for each character in the result string, the position in the original segment.

    Returns
    Type Description
    string
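
    A call to this overload might look like the sketch below, which normalizes the text for comparison and checks the documented one-position-per-character relationship. The class name is hypothetical, and the example assumes a reference to Sdl.LanguagePlatform.Core.dll. It also assumes the overload can be called on a segment that has not been tokenized, which this page does not state.

```csharp
using System;
using System.Collections.Generic;
using Sdl.LanguagePlatform.Core;

// Hypothetical example class; not part of the SDL API.
public static class PlainTextMappingExample
{
    public static void Main()
    {
        var segment = new Segment();
        segment.Add("Déjà Vu");

        List<SegmentPosition> positions;

        // tolower = true: lower-case the result.
        // tobase  = true: map letters to their base character (strip diacritics).
        string normalized = segment.ToPlain(true, true, out positions);

        Console.WriteLine(normalized);

        // Per the description above, there is one position per result character.
        Console.WriteLine(positions.Count == normalized.Length);
    }
}
```

    The returned positions make it possible to map a match found in the normalized string back to a location in the original segment.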

    ToPlain(int, int)

    Returns a string containing only the plain text in this segment, covering the given token range. An exception will be thrown if the segment's tokens are not set or the token range is outside the bounds.

    Declaration
    public string ToPlain(int fromToken, int intoToken)
    Parameters
    Type Name Description
    int fromToken

    The index of the first token

    int intoToken

    The index of the last token (inclusive, i.e. "into" semantics)

    Returns
    Type Description
    string

    A plain text string covering the specified token range

    ToString()

    Returns a string representation of the object, for display purposes.

    Declaration
    public override string ToString()
    Returns
    Type Description
    string

    A string representation of the object, for display purposes.

    Overrides
    object.ToString()

    Trim()

    Removes leading whitespace from the first segment element, if that is a text element, and trailing whitespace from the last segment element, if that is a text element. If the first/last segment element is not a text element, it will not be altered. Also, leading (trailing) whitespace will not be removed from a text element if it is preceded (followed) only by non-text elements. Also deletes any null elements.

    Declaration
    public void Trim()

    TrimEnd()

    Removes trailing whitespace from the last segment element, if that is a text element. If the last segment element is not a text element, nothing will happen. Hence, trailing whitespace will not be removed from a text element if it is followed by non-text elements. The number of elements may be altered by this method. Empty (null) elements will also be removed.

    Declaration
    public string TrimEnd()
    Returns
    Type Description
    string

    A string consisting of the trimmed-off characters, or null if no characters have been trimmed off

    TrimStart()

    Removes leading whitespace from the first segment element, if that is a text element. If the first segment element is not a text element, nothing will happen. Hence, leading whitespace will not be removed from a text element if it is preceded by non-text elements. The number of elements may be altered by this method. Empty (null) elements will also be removed.

    Declaration
    public string TrimStart()
    Returns
    Type Description
    string

    A string consisting of the trimmed-off characters, or null if no characters have been trimmed off
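
    Unlike Trim(), the TrimStart() and TrimEnd() methods hand back the characters they removed, which is useful when the whitespace has to be restored later. The sketch below illustrates this on a text-only segment. The class name is hypothetical, and a reference to Sdl.LanguagePlatform.Core.dll is assumed.

```csharp
using System;
using Sdl.LanguagePlatform.Core;

// Hypothetical example class; not part of the SDL API.
public static class TrimExample
{
    public static void Main()
    {
        var segment = new Segment();
        segment.Add("  Hello world \t");

        // Each call returns the trimmed-off characters, or null if nothing was trimmed.
        string leading = segment.TrimStart();
        string trailing = segment.TrimEnd();

        Console.WriteLine(leading == null ? 0 : leading.Length);
        Console.WriteLine(trailing == null ? 0 : trailing.Length);
        Console.WriteLine("[" + segment.ToPlain() + "]");
    }
}
```

    Always check the return value for null before using it, since null (not an empty string) signals that nothing was removed.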

    UpdateFromTokenIndices(ICollection<int>)

    Updates the segment's text from the tokens, and adjusts span indices accordingly. An exception is thrown if the segment is not tokenized.

    Declaration
    public bool UpdateFromTokenIndices(ICollection<int> tokenIndices)
    Parameters
    Type Name Description
    ICollection<int> tokenIndices

    The indices of the tokens to update.

    Returns
    Type Description
    bool

    true if the segment was changed, and false otherwise.

    Validate()

    Validates the current instance, with the ReportAllErrors validation mode.

    Declaration
    public ErrorCode Validate()
    Returns
    Type Description
    ErrorCode

    An error code (which may be OK, indicating the segment is valid).

    Validate(ValidationMode)

    Performs validation checks on this instance, applying the specified validation mode.

    Declaration
    public ErrorCode Validate(Segment.ValidationMode mode)
    Parameters
    Type Name Description
    Segment.ValidationMode mode

    The validation mode to apply

    Returns
    Type Description
    ErrorCode

    An error code (which may be OK, indicating the segment is valid).
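
    When the reason for a failure matters, Validate() can be used in place of IsValid(), as in the sketch below. The helper name Check is hypothetical. The example assumes that ErrorCode lives in the Sdl.LanguagePlatform.Core namespace and that its success member is spelled ErrorCode.OK, as the return description suggests. A reference to Sdl.LanguagePlatform.Core.dll is assumed.

```csharp
using System;
using Sdl.LanguagePlatform.Core;

// Hypothetical example class; not part of the SDL API.
public static class ValidationExample
{
    // Returns true if the segment passes validation; otherwise reports the error code.
    public static bool Check(Segment segment)
    {
        ErrorCode result = segment.Validate();   // uses the ReportAllErrors validation mode
        if (result == ErrorCode.OK)              // assumed name of the success value
            return true;

        Console.WriteLine("Segment is invalid: " + result);
        return false;
    }

    public static void Main()
    {
        var segment = new Segment();
        segment.Add("Some text");
        Console.WriteLine(Check(segment));
    }
}
```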

    VerifyTokenSpans()

    Verifies whether the spans of the segment's tokens are correct and reflect the segment's text. Note that the segment should be tokenized. If not, true is returned.

    Declaration
    public bool VerifyTokenSpans()
    Returns
    Type Description
    bool

    true if the verification was successful or the segment is not tokenized, and false otherwise.

    WeakEquals(Segment)

    Computes weak equality with another segment.

    Weak equality does not check culture compatibility, and tag anchors do not need to be identical. However, text elements must match, as must the order of tags (element similarity must not be None).

    Declaration
    public bool WeakEquals(Segment other)
    Parameters
    Type Name Description
    Segment other
    Returns
    Type Description
    bool