Extract contents from a binary Dictionary file - Help Request



NetProwler
07-09-2011, 02:33 AM
A brief introduction to the problem: one of our clients has an in-house search facility on her large portal site, which is also triggered on a 404, to 'improve' the user experience. Some pages use French, Spanish and Latin words and expressions as part of the content, e.g. Caveat Emptor, Carpe Diem, raison d'être, etc.

The search script uses a look-up table to discern the meaning of the searched word in case it can't find appropriate content. For other languages, we want to use a language dictionary to populate the look-up table. I know it sounds very convoluted and complicated, but we have to do what we need to do. We downloaded a French-English dictionary for this project, but it comes as an executable installer which, once installed, leaves an executable binary and what looks like a compressed set of word definitions in a binary format. I tried a disassembler and hex editors to extract the word definitions, to no avail. I can't find any ASCII strings containing words.

I did this exercise about 7 years ago for another project, but I can't remember what Perl script I wrote to extract the contents of a similar file. Any pointers?

I would really appreciate any help.

deepsand
07-09-2011, 10:27 PM
Given that the data could be compressed using a proprietary method, or encrypted, there's no one good answer.

In any case, though, the answer lies in the executable.

Are the data in question in their own file, or contained within the EXE?

If the former, what is the EXT of the data file? And, have you tried decompressing it with any of the usual suspects?

If the latter, you'll need to begin by disassembling enough of the EXE to locate the boundaries of the data.
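
For the separate-file case, one rough way to run through the "usual suspects" is sketched below in Python. It only covers the formats in the Python standard library (zlib, gzip, bz2, lzma/xz), and it also scans for zlib streams that begin partway into the file. The function names are illustrative only, and a proprietary or encrypted format will simply produce no hits.

```python
import bz2
import gzip
import lzma
import zlib


def try_whole_file(data):
    """Try each standard-library decompressor on the whole buffer.

    Returns {format name: decompressed bytes} for every attempt that worked.
    """
    attempts = {
        "zlib": zlib.decompress,
        "raw deflate": lambda d: zlib.decompress(d, -zlib.MAX_WBITS),
        "gzip": gzip.decompress,
        "bz2": bz2.decompress,
        "lzma/xz": lzma.decompress,
    }
    results = {}
    for name, func in attempts.items():
        try:
            results[name] = func(data)
        except Exception:
            # Not this format (or corrupt); move on to the next one.
            pass
    return results


def scan_for_zlib(data, min_output=32):
    """Yield (offset, decompressed bytes) for zlib streams embedded in data."""
    for offset in range(len(data) - 1):
        # A zlib header is two bytes: the first is usually 0x78, and the
        # pair read as a big-endian number must be divisible by 31.
        if data[offset] != 0x78:
            continue
        if ((data[offset] << 8) | data[offset + 1]) % 31:
            continue
        try:
            out = zlib.decompressobj().decompress(data[offset:])
        except zlib.error:
            continue
        # Ignore tiny outputs, which are most likely false positives.
        if len(out) >= min_output:
            yield offset, out
```

To use it, read the data file with `open(path, "rb").read()` and pass the bytes to both functions. Any hit still needs a manual look, since a header match alone can be a coincidence.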

NetProwler
07-10-2011, 05:33 AM
Thanks for the reply, deepsand. The dictionary lives in a separate file with the extension .abs. I came to this conclusion because the executable is only about 850 KB while the data file is about 5 MB. I tried the usual utilities to decompress this file, and they all agreed on one thing: it is not an archive. Any more ideas?

deepsand
07-10-2011, 06:27 PM
The abs extension is used for at least 10 different file formats, most of them real oddballs.

Have you tried any of the various abs viewers?

NetProwler
07-12-2011, 12:01 AM
Thanks again, deepsand. I have tried every viewer I could find. The file simply refuses to disclose its contents in ASCII format with any of the tools I tried.
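
A file can hide its text from a plain ASCII view without being compressed, for example if the words are stored as UTF-16 or XORed with a single byte. The Python sketch below checks for those two cases only. The function names are made up for illustration, and the scoring (counting ASCII letters and spaces) is a crude heuristic that undercounts accented French characters.

```python
import re
import string

# Bytes counted as "text-like" when scoring a candidate decoding.
TEXTISH = string.ascii_letters.encode("ascii") + b" "


def printable_runs(data, min_len=5):
    """Return runs of printable characters stored as ASCII or UTF-16LE."""
    ascii_pat = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    utf16_pat = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    runs = [m.group().decode("ascii") for m in ascii_pat.finditer(data)]
    runs += [m.group().decode("utf-16-le") for m in utf16_pat.finditer(data)]
    return runs


def rank_xor_keys(data, top=3):
    """Rank single-byte XOR keys by how many letters and spaces they reveal.

    Returns a list of (key, score) pairs, best first. Key 0 is the
    unmodified data, so it serves as the baseline.
    """
    scores = []
    for key in range(256):
        # translate() applies the XOR through a 256-entry lookup table,
        # which is much faster than looping over every byte in Python.
        table = bytes(b ^ key for b in range(256))
        decoded = data.translate(table)
        score = len(decoded) - len(decoded.translate(None, TEXTISH))
        scores.append((score, key))
    scores.sort(reverse=True)
    return [(key, score) for score, key in scores[:top]]
```

If one key scores far above key 0, decoding with it and running `printable_runs` on the result is a reasonable next step. If no key stands out, the file is probably not using a scheme this simple.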

deepsand
07-12-2011, 01:06 AM
Sounds like a proprietary scheme. Which, unfortunately, means disassembling the EXE.
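
Before committing to disassembly, a rough byte-entropy check can hint at what kind of scheme it is. The Python sketch below is a heuristic only, with illustrative function names: blocks close to 8 bits per byte usually mean real compression or encryption, while noticeably lower values suggest a simpler packing or substitution that may be recoverable by inspection.

```python
import math
from collections import Counter


def shannon_entropy(block):
    """Return the Shannon entropy of a byte string, in bits per byte (0 to 8)."""
    if not block:
        return 0.0
    total = len(block)
    counts = Counter(block)
    return -sum(n / total * math.log2(n / total) for n in counts.values())


def entropy_profile(data, block_size=4096):
    """Return the entropy of each consecutive block of the file."""
    return [
        shannon_entropy(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]
```

Looking at the whole profile rather than a single figure also shows whether the file has a low-entropy header or index followed by a high-entropy body, which helps locate where the actual definitions start.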

NetProwler
07-13-2011, 04:13 AM
Yes. I don't have the time or patience to delve into that now. I will have to look for a dictionary in a simpler format, which is itself a daunting task.
Thanks deepsand.