Author Topic: Convert blobs in an Access database into text  (Read 986 times)

spiros

  • Administrator
  • Hero Member
  • *****
  • Posts: 293795
  • Gender: Male
  • point d’amour
    • spiros.doikas
    • greektranslator
    • doikas
    • 102094522373850556729
    • lavagraph
    • Greek translator CV
Convert blobs in an Access database into text
« on: 07 Feb, 2013, 13:58:27 »
Convert blobs in an Access database into text

I had an issue with a database with text encoded as blobs (all I could see was "long binary data" in each cell) so I had to find a solution. Just a note here: if your blobs are in effect images or other type of non-textual binary data then the proposed solution will simply give you a string of meaningless characters.

One easy way is to open the .mdb database with a good text editor (like Notepad++, I personally use EmEditor but it is commercial). Make sure you guess the encoding right (UTF-8 would be a good one to try). What you might get is text separated with little square symbols. If this is good enough you can copy the text. If this is not good enough, then you need to follow a different procedure.

Step 1. Convert .mdb database to MySQL database. Use the free tool Access To MySQL
Step 2. Use a tool like PhpMyAdmin (available in most server configurations or in local server installs like Wamp).

Construct your query
SELECT CAST(column_name AS CHAR CHARACTER SET utf8) FROM table_name

Replace column_name and table_name with the respective column and table of the blobs.
Run your query. Use export as csv to export the results.

Similarly, you can use this type of query to export directly from MySQL. For example, if you want to get the text from mediawiki articles, try something like this:

Export mediawiki articles from blob to text
Code: [Select]
SELECT CAST(old_text AS CHAR CHARACTER SET utf8) FROM text
Magic provided by courtesy of the MySQL command CAST().
« Last Edit: 07 Feb, 2013, 22:20:12 by spiros »