Microsoft glossaries to Trados memory conversion tool updated to convert unicode

spiros · 11 · 22848

spiros

  • Administrator
  • Hero Member
  • *****
    • Posts: 854548
    • Gender:Male
  • point d’amour
The tool in question is MSGloss2TWB (download link).

MSGloss2TWB: Software localizers, especially those working with programs that run on the Microsoft Windows platform, often consult the glossaries that Microsoft makes freely available through its FTP site. There are literally hundreds of files in every language that Microsoft products are localized into. However, the format of these files, CSV, makes them hard for some users to work with. This program should be of interest to anyone who wants to use these repositories within the Trados Translator's Workbench (TWB): it is a very easy-to-use program that converts any number of these Microsoft files into the TWB format. Once that is done, you should be able to import these files directly as a Trados translation memory.

To install, click on the link below to download, unzip the file to a temporary location on your computer and run setup.  Setup may not run unless you first unzip the files.
http://www.globalready.net/downloads.shtml


Until now it could cope with Microsoft glossaries up to the previous version (2003), but it could not convert the latest update of the Microsoft glossaries for languages like Greek and Japanese, which are Unicode-encoded. This problem has now been fixed, and it can successfully convert Greek and Japanese Unicode Microsoft glossaries to Trados memory format.
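A converter that has to accept both older code-page files and newer Unicode ones needs to work out the encoding before it parses anything. Here is a minimal Python sketch of that step. It is not how MSGloss2TWB itself works: `decode_glossary` is a hypothetical helper, and the fallback code page (cp1253, Greek) is only an example assumption.

```python
import codecs

# Byte-order marks to look for, paired with the codec that consumes them.
_BOMS = [
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def decode_glossary(data: bytes, fallback: str = "cp1253") -> str:
    """Decode raw glossary bytes to text.

    Assumption: Unicode files either start with a BOM or are valid UTF-8;
    anything else is treated as the legacy code page named in `fallback`.
    """
    for bom, codec in _BOMS:
        if data.startswith(bom):
            # The codec strips the BOM and uses it to pick the byte order.
            return data.decode(codec)
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return data.decode(fallback)
```

Calling `decode_glossary(open(path, "rb").read())` returns text that can then be handed to a CSV parser, whatever the original encoding was.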

Links

Exhaustive list of translation tools in Translatum
ApSIC Xbench (a similar tool that converts Microsoft glossaries to Trados and other formats)
CSVConverter by Heartsome
CSVConverter is a free utility that converts glossaries stored in CSV (Comma Separated Values) to the TMX (Translation Memory eXchange) standard. TMX files generated with CSVConverter are compatible with all major CAT tools. You can use CSVConverter to generate valid TMX documents from Microsoft glossaries, Microsoft Excel, OpenOffice Calc or custom CSV files.
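For anyone curious what a CSV to TMX conversion involves, here is a rough Python sketch. It is not CSVConverter's code. It assumes the source text is in the first CSV column and the translation in the second (the real glossary files may have more columns), and the function name, tool name in the header and default language codes are all made up for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# ElementTree serializes this qualified name as the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def csv_to_tmx(csv_text, src_lang="en-US", tgt_lang="el-GR"):
    """Build a minimal TMX 1.4 document from two-column CSV text."""
    root = ET.Element("tmx", version="1.4")
    ET.SubElement(root, "header", {
        "creationtool": "csv-to-tmx-sketch",  # hypothetical tool name
        "creationtoolversion": "0.1",
        "segtype": "sentence",
        "o-tmf": "csv",
        "adminlang": "en-US",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(root, "body")
    for row in csv.reader(io.StringIO(csv_text)):
        # Skip rows that lack a source or a target string.
        if len(row) < 2 or not row[0].strip() or not row[1].strip():
            continue
        tu = ET.SubElement(body, "tu")
        for lang, text in ((src_lang, row[0]), (tgt_lang, row[1])):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(root, encoding="unicode")
```

Each CSV row becomes one `<tu>` with two `<tuv>` children, which is the basic shape a CAT tool expects when importing TMX.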

Access to the Microsoft glossaries is no longer public; it is restricted to people with an MSDN developer membership.
« Last Edit: 27 Apr, 2023, 00:48:48 by spiros »


wings

  • Global Moderator
  • Hero Member
  • *****
    • Posts: 73947
    • Gender:Female
  • Vicky Papaprodromou
Well done, Spiros. You saved me, because I have had the glossaries for months now and could only sit and look at them because of this problem. Of course, the same problem also exists with the MS Glossaries Browser program.

Speech is a great need of the soul. (Giorgos Ioannou)



spiros

Thank you! More changes are on the way: I have asked the developers of both MSGloss2TWB and the ApSIC Xbench Client to improve the export by adding fields with provenance information (e.g. so that you can see in the memory which file and which program each match comes from).


spiros

From an interesting article by Jost Zetzsche:

Translation Memories: The Discovery of Assets
Recognizing opportunities and overcoming obstacles to TM sharing


Microsoft has been the great visionary in this respect. No other company in any other industry that I am aware of has published TMs as extensively and as widely used. Throughout the last 12 years, Microsoft has invested significantly into publishing a large portion of its TMs into 44 languages. Its motive for doing this and for regularly updating these databases is to promote consistent terminology in an industry where Microsoft is certainly one of the leaders in terms of innovation and sheer size. The Microsoft glossaries (actually a misnomer because they are in fact TMs) are easily the most widely used reference materials for any Windows-based software localization project today. And while it is difficult to estimate the success of the Microsoft glossaries in measurable unification of software terminology, it is no coincidence that "File," "Edit," "View" and "Help" are generally translated the same across languages for Windows-based applications.

"The publication of our glossaries has been a success story for Microsoft from beginning to end," says Ursula Schwalbach, team lead of one of Microsoft's terminology management groups, based in Redmond, Washington. "We've achieved a remarkable degree of standardization for Windows-based terminology that has resulted in greater clarity for users. And in the process we feel that we've made a contribution toward greater understanding and clarity in the development and translation community. In fact, this success has led us to begin developing an improved and more user-friendly distribution model."

How are the Microsoft glossaries actually used? The market has provided some of those answers. At least four tools currently available are specifically geared toward working with the large CSV files that the Microsoft memories are published in. One of the tools converts the Microsoft memories into a TRADOS-specific text format and another into TMX for easy use as TMs; the other two are index or search tools that are primarily geared toward terminology searches. This means that while the Microsoft TMs are used as TMs (and it is important to note that their CSV formats make them prime candidates to be imported into a great number of TM systems), they are also used as invaluable terminology repositories.



spiros

I got this today:

Dear Beta Tester,
 
We have just released a new version of our freeware terminology reference tool, ApSIC Xbench 2.7, and it is available for download from our website at http://www.apsic.com/en/downloads.aspx.
 
This version includes many bug fixes and a number of new features, such as:
 
- A Powersearch mode that allows you to perform Google-like searches, for example to locate strings that do not contain a particular string or that contain one of two or more strings.
 
- More supported formats, such as Multiterm XML, TBX, Mac OS software glossaries, or IBM Translation Manager folders in .fxp exported format.
 
- QA functionalities such as checking for numeric consistency, consistency in repetitions, and other checks.
 
- The ability to define checklists to perform several repetitive checks in batch.
 
- And some other bug fixes and performance enhancements here and there, such as the ability to zoom by file and performance improvements for searches returning hundreds of thousands of matches.
 
Please try the new version at your earliest convenience and please do not hesitate to provide us feedback, such as suggestions or bug reports at: http://www.apsic.com/en/products_submit_bug.aspx.
 
I hope you enjoy playing with the new version of ApSIC Xbench.
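
To give an idea of what a numeric consistency check does, here is a small Python sketch. It is only an illustration and not Xbench's actual algorithm: it assumes a segment pair is consistent when the source and target contain the same runs of digits the same number of times, and `numbers_in` and `numeric_mismatches` are hypothetical names.

```python
import re
from collections import Counter


def numbers_in(text):
    """Count every run of digits in the text."""
    return Counter(re.findall(r"\d+", text))


def numeric_mismatches(pairs):
    """Return the (source, target) pairs whose digit runs differ."""
    return [(s, t) for s, t in pairs if numbers_in(s) != numbers_in(t)]


pairs = [
    ("Page 3 of 10", "Σελίδα 3 από 10"),
    ("Wait 5 seconds", "Περιμένετε 6 δευτερόλεπτα"),
    ("1.5 MB", "1,5 MB"),
]
flagged = numeric_mismatches(pairs)  # only the "5" vs "6" pair is flagged
```

Comparing digit runs rather than whole numbers means that a localized decimal separator ("1.5" against "1,5") is not reported as an error.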


spiros

I just got this email. A major bug that stopped Greek-encoded text from being copied correctly has been fixed, and I now fully recommend this free utility to Greek translators:

Quote
Hi Spiros,

In http://www.apsic.com/en/downloads.aspx there is a new build (2.7.283) that we believe fixes most if not all of the encoding problems (when pressing Enter to copy the target text, and/or when trying to move non-Latin-1 text to and from the Xbench controls).

Please try it at your earliest convenience and feel free to report any remaining problems with Greek character encoding.

BTW, also in build 2.7.283, in Tools→Settings→Misc. Settings, there is a new option to disable the fuzzy warning.
« Last Edit: 19 Feb, 2007, 18:07:09 by spiros »


spiros

New build of ApSIC Xbench:

We have just posted a new public build of ApSIC Xbench 2.7, namely build 214. You can download the latest version from http://www.apsic.com/en/downloads.aspx.

Among a number of fixes, the main enhancements from build 183 are these:

Support for more formats with Edit Source. Now you can open the editor directly at the segment that corresponds to the highlighted entry for search results and QA for the following additional formats: IBM Translation Manager, Trados Word uncleaned, Tab-delimited text, Trados exported memory. This addition extends this useful feature that earlier was only available for Trados TagEditor and SDLX files.
Context information for Trados exported memories. Now search results show segment attributes and users for Trados memories.
Long name support for IBM Translation Manager folders. Now file and folder names for IBM Translation Manager appear in long format.
Support for changing the font for search and QA results. Now the font can be changed for search and QA results, which allows you to adjust the font size to your monitor requirements.
Addition of QA check “source=target”. A new check has been added to QA. Now you can get a list of all source segments that match the target segment (potential untranslated segments). By default, this new setting is turned off.
See Context feature is now also available from the QA results. The QA results now allow you to see segments surrounding the selected entry using the See Context function. This feature was already available from search results and now is also available from QA results.
Checklist entries can be sorted alphabetically. Now you can sort the entries of a project or personal checklist alphabetically simply by clicking one button.
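
The “source=target” check is simple to picture. A rough Python equivalent follows; it is an illustration only, not Xbench's implementation, and it assumes that surrounding whitespace should be ignored and that empty segments should be skipped. `untranslated_candidates` is a hypothetical name.

```python
def untranslated_candidates(pairs):
    """Return the (source, target) pairs whose target repeats the source."""
    return [
        (s, t)
        for s, t in pairs
        if s.strip() and s.strip() == t.strip()
    ]


pairs = [("OK", "OK"), ("File", "Αρχείο"), (" Cancel ", "Cancel")]
suspects = untranslated_candidates(pairs)
```

Matches are only candidates: strings such as "OK" or product names are often left identical on purpose, which may be why the check is off by default.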
« Last Edit: 07 Apr, 2008, 14:39:33 by spiros »


spiros

A bug report has been submitted: only ANSI tab-delimited files are imported correctly, not UTF-8 ones. (This problem does not affect the CSV files of the Microsoft glossaries, which are UTF-8 encoded.)
« Last Edit: 08 Jan, 2012, 12:36:56 by spiros »


spiros

ApSIC Xbench 2.8 BETA (Build 385)

New support for Regular Expressions and Microsoft Word Wildcards. Now it is possible to search and add checklist items using regular expression grammar or Microsoft Word wildcards. This allows you to specify very powerful search expressions that we believe will allow you to reach a new level with QA searches. You can check out the power of regular expressions and learn how to use them by running the sample search templates provided against a large glossary such as a Microsoft Windows software glossary.
Faster search engine. The already fast search engine has been improved and is now 50% faster.
More supported formats. Now it also supports SDLX memories, Atril DejaVu and Idiom files, and Logoport RTF files.
Categories for checklist items. Now you can organize your checklist items in categories and run them selectively.
More fine-grained selection of segments to search. Now you can limit searches to only new segments, only ongoing translation or even exclude locked segments in search results.
—And many other enhancements and fixes!
You can download it now from http://www.apsic.com/en/downloads.aspx.
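
As a taste of what regex-based checklist items can catch, here is a short Python sketch. The patterns use Python's `re` syntax, which may differ from the grammar Xbench accepts, and the three checks and the `run_checklist` helper are invented examples, not the sample templates shipped with the tool.

```python
import re

# Each checklist item is a (name, compiled pattern) pair.
CHECKLIST = [
    ("double space", re.compile(r" {2,}")),
    ("space before punctuation", re.compile(r"\s+[,.;:!?]")),
    ("repeated word", re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE)),
]


def run_checklist(segments, checklist=CHECKLIST):
    """Return (segment index, check name) for every pattern that matches."""
    hits = []
    for index, text in enumerate(segments):
        for name, pattern in checklist:
            if pattern.search(text):
                hits.append((index, name))
    return hits


segments = [
    "Click  OK to continue.",
    "Save the the file",
    "All good here.",
    "Are you sure ?",
]
hits = run_checklist(segments)
```

Here `hits` lists segment 0 for the double space, segment 1 for the repeated word and segment 3 for the space before the question mark, while segment 2 passes all three checks.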
« Last Edit: 15 Nov, 2009, 18:36:43 by spiros »


spiros

ApSIC Xbench 2.9 BETA (Build 465)
http://www.apsic.com/en/downloads.aspx

More formats supported. A few more bilingual formats have been added. Now it is possible to load XLIFF files, Trados Studio files, PO files, MARTIF files, IBM TM Memories, Wordfast Pro TXML files and OpenTM2 files.
Folder tree support for most formats. Now you can load folder trees for all formats that do not require special file-dependent settings.
Revamped checklist engine. Now multiple personal lists can be run in a single QA pass and can also be linked to projects. A powerful inheritance engine has been built into checklists to reuse items from other checklists and override items as required.
Font support for Instructions tab. Now the Instructions tab can use any of the system fonts to allow for better presentation and enhanced support of non-Western languages.
And many other enhancements and fixes across the board!


spiros

ApSIC Xbench v2.9 Refresh (build 474)

A new maintenance build for ApSIC Xbench v2.9 with the following main fixes:

Enhanced detection of XLIFF and SDLX XLIFF segment statuses.
Fixed corruption of Asian characters in the Manage Checklists dialogs.
Fixed See Context dialog for PO files.

…and 22 more subtle enhancements and fixes that we are too lazy to detail!


 
