Tool in Mylly: VRT manipulation / Validate VRT document

Validate VRT document

Check a purported VRT document for various issues and produce a validation report.

Parameters specify the level of detail of the report.

level
error, warning, or info (defaults to warning). When set to error, only errors are reported; when set to info, the report also includes issues that are not problems at all but may be good to know.
verbosity
verbose or summary (defaults to verbose). When verbose, the report gives the line number of each occurrence of each kind of issue; when summary, it gives the count of each kind of issue but the line number of only the first occurrence.
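
The effect of the two parameters can be illustrated with a minimal Python sketch. It is not the validator's own code: the function names select and summarize, the tuple layout (line, kind, level, issue), and the row marked as hypothetical are assumptions made for illustration only.

```python
# Severity order assumed from the parameter description above:
# "error" is the strictest setting, "info" the most inclusive.
SEVERITY = {"error": 0, "warning": 1, "info": 2}


def select(issues, level="warning"):
    """Keep the issues that the given level setting would report."""
    limit = SEVERITY[level]
    return [issue for issue in issues if SEVERITY[issue[2]] <= limit]


def summarize(issues):
    """Collapse issues to one row per kind of issue.

    Each row is (first line, count, kind, level, issue).
    """
    summary = {}
    for line, kind, level, issue in issues:
        key = (kind, level, issue)
        if key in summary:
            first_line, count = summary[key]
            summary[key] = (first_line, count + 1)
        else:
            summary[key] = (line, 1)
    return [(first, count, *key) for key, (first, count) in summary.items()]


issues = [
    (1, "form", "error", "malformed tag"),
    (2, "form", "error", "malformed tag"),
    (3, "data", "error", "bare angle"),
    (5, "data", "info", "hypothetical note"),  # invented row, not a real message
    (7, "nest", "error", "element already open: sentence"),
    (8, "nest", "error", "element not closed: sentence"),
]

for row in summarize(select(issues, level="warning")):
    print(*row)
```

With the default level, the invented info row is dropped, and in summary form the two malformed tags collapse into a single row that records line 1 and a count of 2.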

Example

Suppose the following document is meant to be in the VRT format.

    <!DOCTYPE html>
    <text license="not
    to make nay use of this data">
    <sentence>
    five
    years
    <sentence>
  

The document is full of problems. The first line starts with a "left angle bracket" (less-than sign) but is neither a tag nor a comment. The second and third lines would form a valid start tag of a text element if they were on one line, but neither line is valid VRT on its own. The second start tag of a sentence element should be an end tag, or the actual error might be something else, such as many lines missing at this point. Finally, neither the text element nor the sentence element has an end tag.

There is also an apparent spelling error ("nay" for "any") in the intended attribute value, and there is probably no legal basis for prohibiting all use of the data (which looks quite trivial). The validator cannot address such issues.

The validation report is laid out as a table like the following, where each row documents a detected type of issue together with the line number of an instance. The problematic content itself is not shown in the report: earlier attempts to show it made the reports overly verbose, and some kinds of problems also make the content awkward to display.

    line  kind  level  issue
    1     form  error  malformed tag
    2     form  error  malformed tag
    3     data  error  bare angle
    7     nest  error  element already open: sentence
    8     nest  error  element not closed: sentence

The "bare angle" is the "right angle bracket" (greater-than sign) on line 3, which should be encoded as &gt; in content. The validator treats the line as a content line because it does not start with a left angle bracket.
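
As a minimal sketch of how content could be encoded before it is written to a VRT file, the Python standard library function html.escape replaces the greater-than sign with &gt;. It also replaces the less-than sign and the ampersand; that these must be encoded in VRT content as well is an assumption based on XML conventions, not something stated on this page. The helper name encode_content is hypothetical.

```python
from html import escape


def encode_content(text):
    """Encode &, < and > as character entities, leaving quotes alone."""
    # quote=False keeps " and ' unchanged; only &, < and > are replaced.
    return escape(text, quote=False)


print(encode_content("x > y & y < z"))
# prints: x &gt; y &amp; y &lt; z
```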

In this report, the validator has not recognized the start tag of the text element, so it cannot detect that the corresponding end tag is missing.

Line number 8, where the missing end tag is reported, is one past the end of the seven-line document.

An actual report may vary depending on the current version of the validator.
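
The checks described in this example can be approximated with a short Python sketch. It is not the actual validator: the regular expressions, the function name validate, and the message "unexpected end tag" are assumptions made for illustration, and the real tool checks considerably more. Under these assumptions the sketch reproduces the report above for the example document.

```python
import re

# Simplified patterns (assumptions): a start tag with optional
# name="value" attributes, an end tag, and a one-line comment.
START = re.compile(r'<([A-Za-z_][\w.-]*)(?: [A-Za-z_][\w.-]*="[^"<>]*")*>')
END = re.compile(r'</([A-Za-z_][\w.-]*)>')
COMMENT = re.compile(r'<!--.*-->')


def validate(lines):
    """Return a list of (line, kind, level, issue) tuples."""
    issues = []
    open_elements = []  # names of elements opened and not yet closed
    number = 0
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\n")
        if line.startswith("<"):
            start = START.fullmatch(line)
            end = END.fullmatch(line)
            if start:
                name = start.group(1)
                if name in open_elements:
                    issues.append((number, "nest", "error",
                                   f"element already open: {name}"))
                else:
                    open_elements.append(name)
            elif end:
                name = end.group(1)
                if open_elements and open_elements[-1] == name:
                    open_elements.pop()
                else:
                    # Hypothetical message, not taken from the real tool.
                    issues.append((number, "nest", "error",
                                   f"unexpected end tag: {name}"))
            elif not COMMENT.fullmatch(line):
                issues.append((number, "form", "error", "malformed tag"))
        elif "<" in line or ">" in line:
            # A content line must not contain unencoded angle brackets.
            issues.append((number, "data", "error", "bare angle"))
    # Elements still open are reported one past the last line.
    for name in reversed(open_elements):
        issues.append((number + 1, "nest", "error",
                       f"element not closed: {name}"))
    return issues


example = [
    '<!DOCTYPE html>',
    '<text license="not',
    'to make nay use of this data">',
    '<sentence>',
    'five',
    'years',
    '<sentence>',
]

report = validate(example)
for row in report:
    print(*row)
```

Because the broken text start tag is rejected as malformed, it never enters the list of open elements, which is why no missing end tag is reported for the text element.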

Related tools

  • TO BE

Contact

The Language Bank's technical support:
kielipankki (at) csc.fi
tel. +358 9 4572001

Requests related to language resources:
fin-clarin (at) helsinki.fi
tel. +358 29 4140599 / +358 29 4129317