Tool in Mylly: VRT manipulation / Validate VRT document
Validate VRT document
Check a purported VRT document for various issues and produce a validation report.
Parameters specify the level of detail of the report.
- error, warning, or info (defaults to warning): when set to error, only report errors; when set to info, also report issues that are not problems at all but may be good to know
- verbose or summary (defaults to verbose): when verbose, report the line number of each occurrence of each kind of issue; when summary, report the counts of each kind of issue but the line number for only the first occurrence
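The effect of the two parameters can be sketched in Python as follows. This is only an illustration of the behaviour described above, not the validator's own code: the function name `render_report`, the tuple layout, and the ordering of the levels are assumptions made for the example.

```python
# Assumed severity order; the names mirror the parameter values above.
LEVELS = {"error": 0, "warning": 1, "info": 2}

def render_report(issues, level="warning", mode="verbose"):
    """Filter and lay out (line, kind, level, message) tuples.

    level: the least severe level that is still reported.
    mode:  "verbose" keeps every occurrence; "summary" keeps a count
           and the line number of the first occurrence only.
    """
    kept = [issue for issue in issues if LEVELS[issue[2]] <= LEVELS[level]]
    if mode == "verbose":
        return kept
    summary = {}  # (kind, level, message) -> (first line, count)
    for line, kind, lvl, message in kept:
        key = (kind, lvl, message)
        if key in summary:
            first, count = summary[key]
            summary[key] = (first, count + 1)
        else:
            summary[key] = (line, 1)
    return [(first, count, kind, lvl, message)
            for (kind, lvl, message), (first, count) in summary.items()]
```

With `level="error"`, issues at the warning and info levels are dropped. With `mode="summary"`, two occurrences of the same issue on lines 3 and 5 collapse into one row that carries the line number 3 and the count 2.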
Suppose the following document is meant to be in the VRT format.
    <!DOCTYPE html>
    <text license="not to make nay
    use of this data">
    <sentence>
    five
    years
    <sentence>
The document is full of problems. The first line starts with a "left angle bracket" (less-than sign) but is neither a tag nor a comment. The second and third lines would form a valid start tag of a text element if they were on one line, but neither line is valid VRT on its own. The second start tag of a sentence element should be an end tag, or the actual error might be something else, such as many lines missing at this point. And there is no end tag for the text element or for a sentence element.
There is also an apparent spelling error ("nay" for "any") in the intended attribute value, and likely no legal basis to attempt to prohibit any use of the data (which looks quite trivial). The validator cannot address such issues.
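The line-level problems described above can be illustrated with a small Python sketch that sorts each line into one of a few classes. The regular expressions are deliberately simplified and are assumptions for this example. They do not reproduce the validator's actual rules, and the function name `classify` is hypothetical.

```python
import re

# Simplified patterns, assumed for illustration only.
NAME = r"[A-Za-z_][\w.-]*"
START_TAG = re.compile(rf'<{NAME}( {NAME}="[^"]*")*>')
END_TAG = re.compile(rf"</{NAME}>")
COMMENT = re.compile(r"<!--.*-->")

def classify(line):
    """Classify one line of a purported VRT document."""
    if START_TAG.fullmatch(line):
        return "start tag"
    if END_TAG.fullmatch(line):
        return "end tag"
    if COMMENT.fullmatch(line):
        return "comment"
    if line.startswith("<"):
        # Starts like markup but is neither a tag nor a comment
        return "malformed markup"
    # Anything that does not start with < is taken to be content
    return "content"

print(classify("<!DOCTYPE html>"))                  # malformed markup
print(classify('<text license="not to make nay'))   # malformed markup
print(classify('use of this data">'))               # content
print(classify("<sentence>"))                       # start tag
```

In this sketch the broken start tag is seen as two unrelated lines: one that begins like markup but is not a complete tag, and one that looks like content.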
The validation report is laid out as the following kind of table where each row documents a detected type of issue, together with the line number of an instance. The actual problematic content is not shown in the report. (Previous attempts to show the content in the reports have made the reports overly verbose. Some kinds of problems may also make the content awkward to display.)
| 7 | nest | error | element already open: sentence |
| 8 | nest | error | element not closed: sentence |
The "bare angle" refers to the "right angle bracket" (greater-than sign) that should be encoded as &gt; in content. The line looks like a content line to the validator because it does not start with a left angle bracket.
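As a rough Python illustration of what a bare angle is, the sketch below finds unencoded angle brackets in a content line and shows one way to encode them. The function names are hypothetical. Encoding the less-than sign as &lt; and the ampersand as &amp; is assumed here by analogy with &gt; and is not stated in the text above.

```python
def find_bare_angles(line):
    """Return the positions of unencoded < and > in a content line."""
    return [i for i, ch in enumerate(line) if ch in "<>"]

def encode_content(text):
    """Encode the characters that are assumed to be special in content."""
    # & goes first, so that the & of the new entities is not encoded again
    return (text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;"))

line = 'use of this data">'
print(find_bare_angles(line))                  # [17]
print(encode_content(line))                    # use of this data"&gt;
print(find_bare_angles(encode_content(line)))  # []
```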
In this report, the validator has not recognized the start tag of the text element, so it cannot detect that the corresponding end tag is missing.
The line number 8, where missing end tags are reported, is one past the end of the document.
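A nesting check of this kind can be sketched in Python with a stack of open elements. This is a minimal sketch under simplifying assumptions, not the validator's implementation: the patterns are crude, the function name `check_nesting` is hypothetical, and the message "element not open" is invented for the example. The line breaks in the sample document follow the description given earlier.

```python
import re

# Crude patterns, assumed for illustration only.
START = re.compile(r"<(\w+)( [^<>]*)?>")
END = re.compile(r"</(\w+)>")

def check_nesting(lines):
    """Return (line number, message) pairs for nesting problems."""
    issues = []
    open_elements = []  # stack of the names of currently open elements
    for number, line in enumerate(lines, start=1):
        start, end = START.fullmatch(line), END.fullmatch(line)
        if start:
            name = start.group(1)
            if name in open_elements:
                issues.append((number, "element already open: " + name))
            else:
                open_elements.append(name)
        elif end:
            name = end.group(1)
            if open_elements and open_elements[-1] == name:
                open_elements.pop()
            else:
                # Hypothetical message, not taken from the real report
                issues.append((number, "element not open: " + name))
    # Elements still open at the end are reported one past the last line
    for name in reversed(open_elements):
        issues.append((len(lines) + 1, "element not closed: " + name))
    return issues

document = [
    "<!DOCTYPE html>",
    '<text license="not to make nay',
    'use of this data">',
    "<sentence>",
    "five",
    "years",
    "<sentence>",
]
print(check_nesting(document))
# [(7, 'element already open: sentence'), (8, 'element not closed: sentence')]
```

Because the broken start tag of the text element never matches `START`, the sketch never considers the text element open, which is why no missing end tag is reported for it. The missing end tag of the sentence element is reported at `len(lines) + 1`, that is, one past the end of the seven-line document.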
An actual report may vary depending on the current version of the validator.
- TO BE
- VRT format in the Language Bank of Finland