Even with just the TMX files from DGT, you can load them into another concordancer and build word lists, usage-frequency counts, etc., so they are still good to have. TAUS is the best I have found so far, and in combination with Sketch Engine you can build a fairly thorough corpus that combines Internet crawling with European terminology. With Sketch Engine you can create a corpus on the fly based on the words you want to search for. I am going through my links because I swear there was some service where you could submit a domain, a language, and even sample text and have them create a corpus from the web. If I find it again I will post it here.
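If you would rather not depend on a concordancer for the word-list part, a few lines of Python can do a rough version of it. This is only a sketch, not how any particular tool works: the function name `tmx_word_frequencies` and the tiny `SAMPLE_TMX` string are made up for illustration, and it assumes the usual TMX layout of `<tu>` / `<tuv>` / `<seg>` elements with the language given in either an `xml:lang` or a plain `lang` attribute. The tokenization is a naive regex split, nothing linguistically clever.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# ElementTree exposes the xml:lang attribute under this qualified name.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tmx_word_frequencies(tmx_text, lang_prefix):
    """Count word frequencies in all segments whose language code
    starts with lang_prefix (hypothetical helper, for illustration)."""
    root = ET.fromstring(tmx_text)
    counts = Counter()
    for tuv in root.iter("tuv"):
        # Assumption: the language is in xml:lang or in a plain lang attribute.
        lang = tuv.get(XML_LANG) or tuv.get("lang") or ""
        if not lang.lower().startswith(lang_prefix.lower()):
            continue
        seg = tuv.find("seg")
        if seg is None:
            continue
        # itertext() also picks up text nested inside inline markup elements.
        text = "".join(seg.itertext())
        # Naive tokenization: lowercase, then split on non-word characters.
        counts.update(re.findall(r"\w+", text.lower()))
    return counts


# Made-up sample data, not taken from a real DGT file.
SAMPLE_TMX = """<tmx version="1.4">
  <header/>
  <body>
    <tu>
      <tuv xml:lang="EN-GB"><seg>The Commission adopted the regulation.</seg></tuv>
      <tuv xml:lang="FR-FR"><seg>La Commission a adopté le règlement.</seg></tuv>
    </tu>
    <tu>
      <tuv lang="EN-GB"><seg>The regulation enters into force.</seg></tuv>
    </tu>
  </body>
</tmx>"""

freq = tmx_word_frequencies(SAMPLE_TMX, "en")
for word, n in freq.most_common(3):
    print(word, n)
```

For a real TMX file you would read it from disk with `ET.parse(path).getroot()` instead of `ET.fromstring`, and for very large files `ET.iterparse` would probably be kinder to memory.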