Linearize a TMX file on a translation unit basis
1.
Find
<tu
(the search string has to start with all the whitespace preceding the <tu and end with one extra space after it)
Replace
<tu
(no whitespace in front of it, one extra space after it)
2.
Find
\n \s+
(regular expressions on)
Replace with nothing
The result is a list of one-liners, with each line holding a full TU.
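For anyone who would rather script it than do it in an editor, here is a minimal Python sketch of the same two steps. The function name linearize_tmx is made up for illustration. It assumes LF line endings and that every opening <tu tag carries at least one attribute, so that it is followed by a space (which also keeps the search from matching <tuv). It uses [ \t]+ where the recipe above uses \s+, because Python's \s also matches newlines and would swallow blank lines.

```python
import re


def linearize_tmx(text: str) -> str:
    """Put each translation unit on its own line (sketch, LF line endings assumed)."""
    # Step 1: remove the indentation in front of every opening "<tu " tag,
    # so that each TU starts in the first column.
    text = re.sub(r"(?m)^[ \t]+<tu ", "<tu ", text)
    # Step 2: remove every line break that is followed by indentation,
    # which glues all the still-indented lines onto the line before them.
    text = re.sub(r"\n[ \t]+", "", text)
    return text


if __name__ == "__main__":
    sample = (
        '<tmx version="1.4">\n'
        "  <body>\n"
        '    <tu tuid="1">\n'
        '      <tuv xml:lang="en">\n'
        "        <seg>Hello</seg>\n"
        "      </tuv>\n"
        "    </tu>\n"
        "  </body>\n"
        "</tmx>\n"
    )
    print(linearize_tmx(sample))
```

Note that anything indented after the last </tu> (typically </body>) ends up on the same line as the last TU, and the indented header lines end up on the first line.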
Why do I do that? Because most TM/TMX editors suck when it comes to big files and cannot handle big TMs. I use EmEditor instead, where, for example, I can search for invalid/corrupt characters, bookmark those TUs, and then batch delete them.
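The same cleanup can be roughly sketched in Python once the file is linearized. This is not what EmEditor does internally, just an illustration. The function name drop_corrupt_tus is hypothetical, and what counts as "invalid/corrupt" is my assumption here: the Unicode replacement character U+FFFD, or a control character that XML 1.0 does not allow.

```python
import re

# Assumption: a TU is "corrupt" if it contains U+FFFD or a control
# character outside tab, LF and CR (those are not legal in XML 1.0).
BAD_CHARS = re.compile(r"[\uFFFD\x00-\x08\x0B\x0C\x0E-\x1F]")


def drop_corrupt_tus(linearized: str) -> str:
    """Delete every one-line TU that contains a suspicious character."""
    kept = []
    for line in linearized.split("\n"):
        # Only TU lines are candidates for deletion; other lines are kept as-is.
        if line.startswith("<tu ") and BAD_CHARS.search(line):
            continue
        kept.append(line)
    return "\n".join(kept)
```

One thing to watch: if the last TU line also carries the closing </body> tag, deleting that line removes the tag as well, so it is worth checking the end of the file afterwards.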