Tuesday, April 25, 2017

Babylon GLS file sample

  •  GLS Files 

    The GLS glossary source file contains the glossary's terms and their definitions.
    The main body of the glossary source text file consists of a collection of entries called terms:
    the words and phrases defined in the glossary. The term is the primary key (index) of the glossary.

    The Glossary Terms section contains the elements that make up the data of the glossary. Each term element contains a word or a phrase (the glossary term) and a definition. It may also include alternate forms and other elements that control the term's behavior and graphical display.
     GLS Terms Syntax

    In the GLS file, each term or glossary entry has the following basic structure:

    [blank line]
    Term | Alternate1 | Alternate2 | ... | AlternateK
    Definition
    [blank line]
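
The entry structure above can be sketched in a few lines of Python. This is only a minimal illustration, not Babylon's own parser: it assumes that entries are separated by blank lines, that the first line of an entry holds the term and its alternates separated by "|", and that the remaining lines are the definition (as in the sample file further down). The function name parse_gls_entries and the sample alternates are made up for this example.

```python
def parse_gls_entries(text):
    """Split a GLS glossary section into (term, alternates, definition) tuples."""
    entries = []
    block = []
    # The extra "" guarantees that the last entry is flushed.
    for line in text.splitlines() + [""]:
        if line.strip():
            block.append(line.strip())
            continue
        if block:
            # First line: term and alternates; remaining lines: definition.
            head, *body = block
            term, *alternates = [part.strip() for part in head.split("|")]
            entries.append((term, alternates, " ".join(body)))
            block = []
    return entries


# Hypothetical sample; the alternates are invented for illustration.
sample = """
PDCA model | PDCA | Plan-Do-Check-Act
The <B>PDCA</B> model can be applied to all ISMS processes.

statement of applicability
Document describing the control objectives.
"""

for term, alternates, definition in parse_gls_entries(sample):
    print(term, alternates, definition)
```

Keeping the HTML tags in the definition untouched is deliberate here, since the definition text is meant to be rendered by the client.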

      Text Formatting

    Definitions may contain HTML tags to enhance their formatting and display in the Babylon client.
    In addition to standard HTML tags, the following tags can also be used in the glossary source files:
    Alternate Name

     How to generate an updated glossary from an old GLS?

    Babylon Glossary Builder opens GLS files as data sources and transforms them into GPR files. The GPR file includes a reference to the GLS file and keeps the GLS glossary ID, to prevent duplicate glossaries.
  • Start a new glossary project.
  • Fill in all the glossary properties under General glossary properties, as described in section 3.
  • Upload your GLS file, build the glossary, and save your glossary project for further use.
  • This creates a GPR file and a BGL file from your GLS file. The resulting glossary contains all the GLS terms, alternates and definitions, and keeps the same glossary ID.
    If you wish to edit the terms in the GLS file, refer to the GLS Terms Syntax section.
GLS format
The GLS file format is an alternative in which the raw data is stored as simple text. A GLS file needs to be converted into the binary BGL format so that GoldenDict can read it. The conversion can be done with either the new or the old version of the Babylon Builder tool (see above). The format of the file is described on the Babylon page.
[blank line]
Term | Alternate1 | Alternate2 | … | AlternateK
Definition
[blank line]
An example is shown below:
### Glossary title:ITU-T Security Dictionary Test
### Author:Lauri Säisä
### Description:Security related terms and definitions used in ITU-T recommendations.
### Source language:English
### Source alphabet:Default
### Target language:English
### Target alphabet:Default
### Icon:
### Browsing enabled?Yes
### Type of glossary:00000000
### Case sensitive words?0

### Glossary section:
Ensuring that authorized users have access to information and associated<BR>assets when required.<BR>X.1051/2004
Ensuring that information is accessible only to those authorized to have<BR>access.<BR>X.1051/2004
Safeguarding the accuracy and completeness of information and processing<BR>methods<BR>X.1051/2004
PDCA model
The model, known as the &quot;<U>Plan-Do-Check-Act</U>&quot; (<B>PDCA</B>) model, can be applied to all ISMS processes, as adopted in this Recommendation. Figure 1 illustrates how an ISMS takes as input the information security requirements and expectations of the telecommunications organizations and interested parties associated with the telecommunication sector and through the necessary actions<BR>and processes produces information security outcomes (i.e., managed information security) that meet those requirements and expectations.<BR><img src="c:\ftpserver\blogdesk\figures\pdca.png" align="center"><BR><BR><I>ITU-T Rec. X.1051 (07/2004)</I>
statement of applicability
Document describing the control objectives and controls that are relevant and applicable to the organization’s ISMS, based on the results and conclusions of the risk assessment and risk treatment processes.<BR>X.1051/2004

### Glossary title:Contacts
### Author:
### Description:Contacts
### Source language:Dutch
### Source alphabet:Latin
### Target language:Dutch
### Target alphabet:Latin
### Icon:
### Icon2:
### Browsing enabled?Yes
### Type of glossary:00000000
### Case sensitive words?0
; DO NOT EDIT THE NEXT **SIX** LINES  - Babylon-Builder generated text !!!!!!
### Glossary id:029f645f6877899f836e9c869d8a89447c6e8e9e8271769a9d659559957889978372869c8477772a9693897f5d89526e7c458944648127444854575a428a9244977c946e5b524147584c559fcc23264eac62515a414f4b8b279a224f5a42474f5f555638993bd86e5b524147584c559f908e2d60475e5a4b597e66833a32294d48429a4d5955
### Confirmation string:7C221QRF
### File build number:01292D7C
### Build:
### Glossary settings:00000000
### Gls type:00000001
; DO NOT EDIT THE PREVIOUS **SIX** LINES  - Babylon-Builder generated text !!!!!!
### Part of speech table:
### Private label id:
### Min version:0
### Regular expression:
### Glossary section:
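
The header lines in the two samples above all follow the same visible pattern: "###", a key, a ":" or "?" separator, and a value. The Python sketch below reads such a header into a dictionary. It is based only on what the samples show, not on an official Babylon specification, and the name parse_gls_header is hypothetical. Lines starting with ";" are treated as comments, and reading stops at "### Glossary section:".

```python
import re

# "### ", a key, a ":" or "?" separator, then the (possibly empty) value.
# This pattern is inferred from the sample headers, not from a spec.
HEADER_RE = re.compile(r"^###\s*([^:?]+)[:?](.*)$")


def parse_gls_header(lines):
    """Collect GLS header fields until the glossary section starts."""
    header = {}
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(";"):
            # Builder-generated comment line, e.g. the "DO NOT EDIT" markers.
            continue
        match = HEADER_RE.match(line)
        if not match:
            continue
        key, value = match.group(1).strip(), match.group(2).strip()
        if key == "Glossary section":
            break
        header[key] = value
    return header


sample = [
    "### Glossary title:Contacts",
    "### Source language:Dutch",
    "### Browsing enabled?Yes",
    "; DO NOT EDIT THE NEXT **SIX** LINES",
    "### Gls type:00000001",
    "### Glossary section:",
]
print(parse_gls_header(sample))
```

The Builder-generated lines (glossary id, confirmation string and so on) are read like any other field here; the sketch never writes them back, in keeping with the "DO NOT EDIT" markers.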

DSL format as alternative
GoldenDict can read *.dsl files (the format used by the ABBYY Lingvo tool), so DSL can be used as an alternative format. In fact, it is an easier format to maintain, since GoldenDict can read this text format directly. Tips on how to create DSL files are available in the GoldenDict forum pages. A sample DSL file (plain text) is available here.
There is also a Perl script, TXT2DSL, that quickly converts a file in a simple data format into DSL format. The tool expects each line to be in the following format (a tab character separates the fields).
word <tab> translation <tab> explanation <tab> comment
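
To illustrate the idea, here is a rough Python sketch of such a conversion. It is not the TXT2DSL script and does not claim to reproduce its output: the card layout (which field goes into which tag and margin) is my own assumption, and the names escape_dsl and tsv_line_to_dsl are made up. It relies only on two things visible in the DSL example further down and in the DSL format in general: a headword starts at the beginning of a line with the card body indented below it, and literal square brackets are escaped with a backslash.

```python
def escape_dsl(text):
    """Escape literal square brackets so they are not read as DSL tags."""
    return text.replace("[", "\\[").replace("]", "\\]")


def tsv_line_to_dsl(line):
    """Turn 'word<TAB>translation<TAB>explanation<TAB>comment' into one DSL card."""
    fields = [escape_dsl(f) for f in line.rstrip("\n").split("\t")]
    # Pad so that missing trailing fields become empty strings.
    word, translation, explanation, comment = (fields + [""] * 4)[:4]
    card = [word]  # headword at column 0
    if translation:
        card.append("\t[m1][trn]" + translation + "[/trn][/m]")
    if explanation:
        card.append("\t[m2]" + explanation + "[/m]")
    if comment:
        card.append("\t[m2][com]" + comment + "[/com][/m]")
    return "\n".join(card) + "\n"


print(tsv_line_to_dsl("abacus\tcounting frame\ta frame with small balls\tnoun"))
```

Other characters that have a special meaning in DSL are not handled here, so real data may need more escaping than this.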
The following example shows what a DSL file looks like.
#NAME "My English Dictionary"
/ [s]abacusBr.wav[/s] 'æbəkəs; [p]NAmE[/p] [s]abacusUS.wav[/s] / [c]noun[/c] [s]abacus.jpg[/s]
([p]pl.[/p] [c darkcyan][b]aba·cuses[/b][/c] / [s]abacusesBR.wav[/s] -kəsɪz; [p]NAmE[/p] [s]abacusesUS.wav[/s] /) a frame with small balls which slide along wires. It is used as a tool or toy for counting.

[m1][b][c]acquiesces[/b][i],[/i] [b]acquiescing[/b][i],[/i] [b]acquiesced[/b][/m]
[m1][trn]If you [b]acquiesce[/b] in something, you agree to do what someone wants or to accept what they do. \[FORMAL\][/trn][/m]
[m2][*][ex][lang id=1033]\[[i][c]V in/to [p]n[/c][/i][/p]\] Steve seemed to acquiesce in the decision…[/lang][/ex][/*][/m]
[m2][*][ex][lang id=1033]\[[i][c]V in/to [p]n[/c][/i][/p]\] He has gradually acquiesced to the demands of the opposition…[/lang][/ex][/*][/m]
[m2][*][ex][lang id=1033]When her mother suggested that she should not go far from the hotel, Alice willingly acquiesced.[/lang][/ex][/*][/m]
[m1][*][com]give in[/com], [com]submit[/com][/*][/m]
[m1]If you [b]acquiesce[/b] in something, you agree to do what someone wants or to accept what they do. \[FORMAL\][/m]
[m2][ex][c darkgray]\[[i][c]V in/to [p]n[/c][/i][/p]\] Steve seemed to acquiesce in the decision…[/c][/ex][/m]
[m1][com][c]give in[/c][/com][/m]
[m1][b][c]embeds[/b][i],[/i] [b]embedding[/b][i],[/i] [b]embedded[/c][/b][/m]
[m1]1) [p][i][trn]VERB[/i][/p] If an object [b]embeds[/b] itself in a substance or thing, it becomes fixed there firmly and deeply.[/trn][/m]
[m2][*][ex][lang id=1033]\[[i][c]V [p]n[/p] in [p]n[/c][/i][/p]\] One of the bullets passed through Andrea’s chest before embedding itself in a wall [url]http://www.google.com[/url].. \[[i][c]Also V [p]n[/p] [p]prep[/c][/i][/p]\][/lang][/ex][/*][/m]
[m2][*][b]Derived words:[/b][/*][/m]
[m2][*][b][lang id=1033]embedded[/b] [i][c]ADJ-GRADED[/c][/i] [i][c]oft ADJ in [p]n[/c][/i][/p] [ex]The fossils at Dinosaur Cove are embedded in hard sandstones…[/ex] [ex]There is glass embedded in the cut.[/lang][/ex][/*][/m]
[m1]2) [p][i][c][trn]VERB:[/p] usu [p]passive[/c][/i][/p] If something such as an attitude or feeling [b]is embedded[/b] in a society or system, or in someone’s personality, it becomes a permanent and noticeable feature of it.[/trn][/m]
[m2][*][ex][lang id=1033]\[[i][c]be [p]V-ed[/p] in [p]n[/c][/i][/p]\] This agreement will be embedded in a state treaty to be signed soon by Bonn and East Berlin.[/lang][/ex][/*][/m]
[m2][*][b]Derived words:[/b][/*][/m]
[m2][*][b][lang id=1033]embedded[/b] [i][c]ADJ-GRADED[/c][/i] [i][c]oft ADJ in [p]n[/c][/i][/p] [ex]I think that hatred of the other is deeply embedded in our society.[/lang][/ex][/*][/m]

1. DSL tag
I found that the DSL format is very useful for making a user dictionary, because it is uncompiled plain text. I can make changes any time I want and add new articles incrementally.
When I want to make an article richer and more readable, the DSL markup tags become necessary. In Lingvo's help file I found the following tags, which are supported in DSL.

[b],[/b] - boldfaced font
[i],[/i] - italics
[u],[/u] - underlined font
[c],[/c] - coloured (highlighted) font
[mN],[/m] - the left paragraph margin; N is the number of spaces (0-9).
[s],[/s] - multimedia zone (used to add pictures or sound files to dictionary entries).
[url],[/url] - link to a Web page.
[p],[/p] - labels (clicking a label displays its full text)
[ref],[/ref] - hyperlink to a card in the same dictionary (or <<, >>)
[sub],[/sub] - subscript
[sup],[/sup] - superscript
['],[/'] - a stressed vowel in a word.
[ex], [/ex] - examples zone.
[*], [/*]  - the text between these tags is only displayed in full translation mode
[trn], [/trn] - translations zone.
[com], [/com] - comments zone.
[!trs], [/!trs] - the text between these tags will not be indexed
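
As a small illustration of how these tags sit in the text, the Python sketch below strips every tag from a DSL line and leaves the plain text, which can be handy for a quick preview of an article. It is only a sketch with a made-up function name (dsl_to_plain), not the way GD or Lingvo parse DSL. It assumes that a tag is anything between unescaped square brackets and that "\[" and "\]" stand for literal brackets, as in the example file above.

```python
import re

# An unescaped "[...]" with no brackets inside; "\[" and "\]" are skipped.
TAG_RE = re.compile(r"(?<!\\)\[[^\[\]]*(?<!\\)\]")


def dsl_to_plain(text):
    """Remove DSL tags and turn escaped brackets back into literal ones."""
    without_tags = TAG_RE.sub("", text)
    return without_tags.replace("\\[", "[").replace("\\]", "]")


line = r"[m1]If you [b]acquiesce[/b] in something, you agree. \[FORMAL\][/m]"
print(dsl_to_plain(line))
```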

I tested and confirmed that the tags above are recognized internally by GD.
However, some Lingvo tags seem to do nothing in GD.
These tags seem to have no references in article-style.css and show no recognizable effect.

[*], [/*]  - the text between these tags is only displayed in full translation mode
[trn], [/trn] - translations zone.
[com], [/com] - comments zone.
[!trs], [/!trs] - the text between these tags will not be indexed

If I have misunderstood something, please let me know.

2. Representative headword for multiple ones
In a DSL dictionary, I want only one headword to appear for several synonyms.
That is, assuming several headwords (e.g. yi, いち, 일, 一) share one article, I want GD to show it like this even when I search for "yi", いち or 일:

one, single; individual; undivided