Apsic xbench utility

#Apsic xbench utility license
#Apsic xbench utility free

You are free to distribute and modify the code, as long as any significant modifications or derived works are also made available under the GPL terms. It is free for personal use (use by freelance translators for their work is considered personal use).

LF Aligner is distributed under the GNU General Public License version 3 or newer.

I kept adding information and this readme ended up being pretty long. If you want to get started quickly without reading the whole thing, you can do so by following the steps described in sample/howto.txt, but you should probably come back to this readme later, especially if you get stuck with something.

The accuracy of Hunalign's automatic pairings depends entirely on the quality of the source material (whether you have removed page headers and footers etc.) and whether it has a good dictionary to work with, but percentages in the high nineties are common. (Reasonably good dictionary data is bundled with LF Aligner for more than 800 combinations of 32 languages. You can check the log to see if this dictionary data was used for your alignment.) The primary output is TMX, but if you don't use TMX-compatible software, the aligner can generate xls files for you. Tab delimited txt files are always generated as well, suitable for use with Apsic Xbench or processing with other tools. LF Aligner also gives you complete control over the whole process: in the TMX, you can set the date and time, language codes, creator ID, add notes to each segment etc., and you have extensive customisation options for many other features, too. Just open aligner_setup.txt to see the main setup options.
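Because the tab delimited txt output is plain text, it is easy to post-process with your own scripts. The sketch below is only an illustration, not part of LF Aligner: it assumes the source segment is in the first column and the target segment in the second, and the function names (read_pairs, build_tmx), the language codes and the header values are all made up for the example. It reads aligned pairs and writes a minimal TMX document with Python's standard library.

```python
import xml.etree.ElementTree as ET


def read_pairs(lines):
    """Yield (source, target) tuples from tab-delimited lines.

    Assumption: column 1 is the source segment, column 2 the target.
    Rows with a missing or empty segment are skipped; extra columns are ignored.
    """
    for line in lines:
        cols = line.rstrip("\n").split("\t")
        if len(cols) >= 2 and cols[0].strip() and cols[1].strip():
            yield cols[0].strip(), cols[1].strip()


def build_tmx(pairs, src_lang="en", tgt_lang="de", creator="example"):
    """Return a minimal TMX 1.4 document as a string (placeholder header values)."""
    tmx = ET.Element("tmx", version="1.4")
    ET.SubElement(tmx, "header", {
        "creationtool": "sketch",
        "creationtoolversion": "0",
        "segtype": "sentence",
        "o-tmf": "txt",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for src, tgt in pairs:
        # One translation unit per aligned pair, one variant per language.
        tu = ET.SubElement(body, "tu", creationid=creator)
        for lang, text in ((src_lang, src), (tgt_lang, tgt)):
            tuv = ET.SubElement(tu, "tuv", {"xml:lang": lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")
```

To use it on a real file, you would pass an open file object to read_pairs (for example open("aligned.txt", encoding="utf-8"), where the file name is hypothetical) and write the returned string to a .tmx file.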

LF Aligner is intended for translators who wish to create translation memories from translations made without a CAT tool or from any other text that is available in two or more languages. I wrote it to make what is probably the best open source automatic sentence aligning algorithm, Hunalign, more convenient to use. LF Aligner also has a couple of features designed for larger-scale corpus building, such as handling huge data sets, built-in data filtering, batch mode, automatic segmentation evaluation and unattended operation. The aligner also has other features like creating TMX files and downloading EU legislation or any other bilingual HTML webpage for alignment (see details on the web features further down).

The reason why you may want to use this simple tool instead of the flashy and complicated aligners from the big players is Hunalign. It uses a smart algorithm to determine which sentence goes with which, relying on sentence length, a dictionary and, as near as I can tell, black magic, and it does a really good job. The upshot is that you don't have to manually pair up the segments, only review the pairings and do any necessary corrections - or not even that. Most of the time you will get a very usable TM without human input.
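To give a feel for how sentence length alone can drive an alignment, here is a toy sketch. It is not Hunalign's code and makes no claim about how Hunalign works internally; it uses no dictionary at all. The function name align_by_length and the two penalty values are assumptions chosen for the example. It uses dynamic programming to find the cheapest sequence of 1-1, 1-0, 0-1, 2-1 and 1-2 pairings, where a pairing is cheap when the character lengths on both sides are similar.

```python
def align_by_length(src, tgt, skip_penalty=3.0, merge_penalty=1.0):
    """Align two lists of sentences using character lengths only.

    Returns a list of (source_indices, target_indices) pairings.
    """
    # Allowed pairing shapes: (source sentences used, target sentences used).
    shapes = [(1, 1), (1, 0), (0, 1), (2, 1), (1, 2)]
    n, m = len(src), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0

    def pairing_cost(i, j, di, dj):
        if di == 0 or dj == 0:
            return skip_penalty  # a sentence left without a partner
        a = sum(len(s) for s in src[i:i + di])
        b = sum(len(t) for t in tgt[j:j + dj])
        mismatch = abs(a - b) / max(a, b, 1)  # 0.0 when lengths are equal
        return mismatch + (merge_penalty if di + dj > 2 else 0.0)

    # Fill the table in row-major order; each cell only extends forward.
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for di, dj in shapes:
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                c = cost[i][j] + pairing_cost(i, j, di, dj)
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (di, dj)

    # Walk the back-pointers from the end to recover the pairings.
    pairings = []
    i, j = n, m
    while i > 0 or j > 0:
        di, dj = back[i][j]
        pairings.append((list(range(i - di, i)), list(range(j - dj, j))))
        i, j = i - di, j - dj
    pairings.reverse()
    return pairings
```

With real text, length alone is easily fooled, which is why a dictionary makes such a difference to the quality of the pairings.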

Input files and how they are handled, tagged formats, running in perl

Advanced tips: the built-in sentence splitter (segmenter), using your CAT for segmentation, Hunalign, GUI
Batch alignment using command line arguments
Downloading EU and other documents from the web, language codes
Input files, notes on doc, docx, rtf and pdf, basic instructions
Contact: the latest release from the original source: