XDXF standard; Draft 019

The purpose of the XDXF standard is to specify an interchange format for free dictionaries. Later, scripts will be written to convert from different popular formats to XDXF, and then scripts to convert from XDXF back to those popular formats, thus allowing dictionaries to be converted from any format to any other. XDXF dictionaries might (and should :) also be used directly.

Each dictionary is located in its own folder, and the name of the folder is used as the ID. So, if the dictionary name is "Webster's Unabridged Dictionary published in 1913", then the folder name should be something like "Webster1913". The dictionary file itself is always "dict.xdxf".

It is recommended that each dictionary have a set of icons for toolbars and a large icon for the front page. The sizes should be 16x16, 32x32, and 512x512, and the filenames should be icon16.png, icon32.png, and icon512.png respectively. Note that all file names are case sensitive.

All XDXF dictionary text files (those with the .xdxf extension) are in XML format with UTF-8 encoding.

------------------------------------------------------------------------------------
XDXF Tags:
------------------------------------------------------------------------------------

The root element.

Information about the dictionary. The following tags are allowed only in between tags.

Full name of the dictionary, as it would appear on the book cover. It may contain non-English symbols.

ISO 639 three-letter code of the language of the key-phrases.

ISO 639 three-letter code of the language of the meanings.

Type can be one or more of the following: translation, explanatory, encyclopedia, spelling, transcription, audio. If more than one type is specified, they shall be separated by a comma.

Description of the dictionary in free words. It is recommended to include the following: Copyright, License, where this file can be downloaded, where the unformatted file can be downloaded, i.e.
the original dictionary file before the conversion into XDXF format, where the original unformatted dictionary file was downloaded, and a link to the script that was used to convert the original unformatted dictionary file into XDXF format. The description may contain the XHTML tags that are allowed in XDXF and specified below.

section is a list of tags. It describes abbreviations used in the dictionary.

The tag defines an abbreviation and contains two types of tags:

(k stands for key-phrase) The abbreviated text.

(v stands for value) The full text.

Note that there may be more than one per to specify synonyms like "Ave." and "Av.", but tag can be only one.

Stands for article. This tag groups together everything related to one key-phrase. The following tags are allowed only in between tags.

The key-phrase. In general, there would be one tag per . However, some existing dictionaries with a nested structure have several keys per article, and those tags are interleaved with tags. For new dictionaries, it is recommended to have only one tag per article, except in cases where the key-phrases are synonyms (like "disc" and "disk") and they are all specified before the first tag. Note that the dictionary parser puts an EOL character after this tag. Thus, creators of XDXF dictionaries should not place an EOL character after this tag, to avoid duplication.

Marks the optional part of the key-phrase.

This tag marks a definition or a group of definitions which fall into a certain category. For the English language, those categories could be parts of speech, for example: noun, verb, adverb, etc. Note that tags can be nested. Programs may show them as indented paragraphs, in a similar way as C++ programs are indented with curly braces. The tag is optional. If the article is simple and there is nothing to group, don't use it. [TODO: TBD] Note that the dictionary parser treats this tag as
in HTML. Thus, creators of XDXF dictionaries should take this into consideration when formatting the articles.

specifies the Part Of Speech, like: noun, verb, adverb, etc. Note that the dictionary parser puts an EOL character after this tag. Thus, creators of XDXF dictionaries should not place an EOL character after this tag, to avoid duplication.

The tense. For example: past, present, future, past participle, etc.

Marks transcription of the key-phrase.

This tag marks Direct Translation of the key-phrase to another language.

Reference to another key-phrase, which is located in the same file.

Reference to a Resource file, which is located in the same folder. Optional attributes are necessary for audio and video files, when the reference points to a certain part of a large file. The attribute "start" specifies the position in the file of the first byte of the chunk of interest, and "size" specifies its length in bytes. If the "start" attribute is omitted, it is assumed to be 0. If the "size" attribute is omitted, it is assumed that the file should be played up to the end.

Reference to an Internet resource.

Marks an abbreviation that is listed in the section.

... (c c stands for color code) Marks text with a given color. The syntax for the "c" attribute is the same as for the "color" attribute of the "font" tag in HTML. If the color attribute is omitted, the default color is implied. The default color is chosen by the parser.

Marks the text of an example (usually shown in a different color by the program).

Marks the text of an editorial comment (comments are usually shown in a different color by the program).

Marks a sub-article. Sub-articles are used to represent nested articles.
[TODO: Add more description, and an example]

------------------------------------------------------------------------------------
Non-XDXF Tags:
------------------------------------------------------------------------------------

Ideally, XDXF would have only tags that specify the logical structure of the text and not its appearance. It would be up to the program to decide how a certain element looks, and that appearance would be customizable by the user. However, in the real world we have dictionaries which already have some visual markings, like and , and often it is impossible for a converting script to determine which logical markings should be substituted for them. Therefore, in between the innermost tags, the following XHTML tags are allowed for visual marking: , , , , , , ,
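
The defaults described earlier for the "start" and "size" attributes of a reference to a Resource file can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function name read_resource_chunk is hypothetical and not part of XDXF, and the demo file content is made up. A missing "start" is treated as 0 and a missing "size" as "read up to the end of the file".

```python
import os
import tempfile


def read_resource_chunk(path, start=None, size=None):
    """Return the bytes a resource reference points at.

    Hypothetical helper, not defined by XDXF. `start` is the offset of the
    first byte of the chunk and `size` is its length in bytes.
    """
    # An omitted "start" is assumed to be 0.
    offset = 0 if start is None else start
    with open(path, "rb") as f:
        f.seek(offset)
        # An omitted "size" means: continue up to the end of the file.
        return f.read() if size is None else f.read(size)


if __name__ == "__main__":
    # Ten made-up bytes stand in for a large audio file.
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "sounds_of_words.ogg")
        with open(path, "wb") as f:
            f.write(b"0123456789")
        print(read_resource_chunk(path, start=2, size=3))  # b'234'
        print(read_resource_chunk(path, start=7))          # b'789'
        print(read_resource_chunk(path))                   # b'0123456789'
```

A real player would decode the returned bytes as audio or video; the sketch stops at extracting the byte range.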
------------------------------------------------------------------------------------
Example:
------------------------------------------------------------------------------------

Webster's Unabridged Dictionary
ENG
ENG
explanatory, transcription
Webster's Unabridged Dictionary published 1913 by the C. G. Merriam Co...

n. noun
v. verb
Av. Ave. Avenue

The United States of America
Соединенные Штаты Америки

record
n. [re'kord] Anything written down and preserved.
v. [reko'rd] To write down for future use.

home [ho:um] n. sounds_of_words.ogg
One's own dwelling place; the house in which one lives.
One's native land; the place or country in which one dwells.
The abiding place of the affections. For without hearts there is no home.
дом
at home - дома, у себя; make yourself at home - будьте как дома
XDXF Home page: http://xdxf.sourceforge.net
See also: home-made

indices
Plural form of word index

disc
disk
n. A flat, circular plate; as, a disk of metal or paper.

------------------------------------------------------------------------------------
* Note: The example above has some extra spaces and \n symbols. They are added here for better visualization. XDXF files are intended to be viewed not in a text editor but in a dictionary program. You can preview XDXF articles in the reference XDXF viewer: http://xdxf.sourceforge.net/viewer/

All XDXF-compliant dictionaries must show articles exactly as the reference XDXF viewer does.
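
The folder conventions stated at the top of this document (one folder per dictionary whose name is the ID, a "dict.xdxf" file, the three recommended icons, case-sensitive file names, XML content) can be checked with a short script. What follows is a minimal Python sketch, not a reference tool: the function name check_dictionary_folder and the wording of its messages are illustrative assumptions, and it tests only that "dict.xdxf" is well-formed XML, not that it follows the XDXF tag rules or is UTF-8 encoded.

```python
import os
import xml.etree.ElementTree as ET

REQUIRED_FILE = "dict.xdxf"
RECOMMENDED_ICONS = ("icon16.png", "icon32.png", "icon512.png")


def check_dictionary_folder(folder):
    """Return (dictionary_id, problems) for one dictionary folder.

    Hypothetical helper; an empty `problems` list means the layout
    matches the conventions described in this document.
    """
    # The folder name is used as the dictionary ID.
    dictionary_id = os.path.basename(os.path.normpath(folder))
    problems = []

    # Compare against the exact names on disk, so the check stays
    # case sensitive even on a case-insensitive file system.
    names = set(os.listdir(folder))

    if REQUIRED_FILE not in names:
        problems.append("missing " + REQUIRED_FILE)
    else:
        try:
            ET.parse(os.path.join(folder, REQUIRED_FILE))
        except ET.ParseError as err:
            problems.append("%s is not well-formed XML: %s" % (REQUIRED_FILE, err))

    # Icons are only recommended, so they are reported, not required.
    for icon in RECOMMENDED_ICONS:
        if icon not in names:
            problems.append("recommended icon not found: " + icon)

    return dictionary_id, problems
```

Calling check_dictionary_folder("Webster1913") returns the folder name as the ID together with a list of human-readable problems.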