Yet another reverse engineering blog

Friday, November 23, 2007

Embiid Publishing

Short History

Embiid Publishing was an early e-publishing company that started back in 2000. They published some midlist SF and Romance titles, the most famous probably being the Liaden series by Sharon Lee and Steve Miller. They offered nice prices ($5 on average) and free sampler bundles. At first their books were Windows-only; later they started to offer the Rocket format and a book reader for Palm OS. In 2006 the company closed its doors, leaving customers with books they could not convert.

The Reader

The Windows reader program could read two formats: UBK and EBK. The former was slightly scrambled but could be read by any copy of the reader. The latter was encrypted with a personalized key and could only be read by the personalized reader executable that was downloadable with the first purchase.
The Reader was written in Delphi and had pretty basic functionality: changeable font, bookmarks, navigation.

File format details

A pseudo-C description of the file header looks like the following:
struct EmbiidFile {
/* 00 */  int32  file_seed;       // the seed for decrypting header fields
/* 04 */  char   type[5];         // file type (encrypted w/ file_seed)

#define FTYPE_UBK "Valid"         // non-personalized (text encrypted with file_seed)
#define FTYPE_EBK "EBook"         // personalized (text encrypted with user_seed)

/* 09 */  uint32 cover_off;       // offset of the cover image (JPEG image)
/* 0D */  uint32 cover_len;       // length of the cover image data
/* 11 */  byte   version;         // format version (encrypted w/ file_seed)
/* 12 */  char   title[50];       // book title (encrypted w/ file_seed), space-padded
/* 44 */  char   author[954];     // book author (encrypted w/ file_seed), space-padded
/* 3FE */ uint16 nchapters;       // number of chapters
/* 400 */ uint32 chap_lens[256];  // chapter lengths
/* 800 */ char   book_text[];     // text of the book. UBK: encrypted with file_seed, EBK: encrypted with user_seed
};
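As a rough illustration, the header layout above can be read with Python's struct module. This is only a sketch: the little-endian byte order is an assumption (the format comes from a Windows program), and the names `HEADER_FORMAT`, `EmbiidHeader` and `parse_header` are made up for this example. The fields marked as encrypted are returned as raw bytes, still encrypted.

```python
import struct
from collections import namedtuple

# Unpadded layout matching the offsets in the pseudo-C struct.
# "<" (little-endian, no alignment) is an assumption, not stated in the post.
HEADER_FORMAT = "<i5sIIB50s954sH256I"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # book text starts right after

EmbiidHeader = namedtuple(
    "EmbiidHeader",
    "file_seed type cover_off cover_len version title author nchapters chap_lens",
)

def parse_header(data):
    """Split the first 0x800 bytes of a book file into raw header fields."""
    if len(data) < HEADER_SIZE:
        raise ValueError("file too short for an Embiid header")
    fields = struct.unpack_from(HEADER_FORMAT, data, 0)
    # The first 8 values are the scalar/string fields, the rest are chap_lens.
    return EmbiidHeader(*fields[:8], chap_lens=fields[8:])
```

The computed `HEADER_SIZE` comes out to 0x800, which agrees with the offset of book_text in the struct.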

The encryption uses a 1024-byte array to xor the data with. The array is initialized from the seed using a pseudo-random number generator. Here's pseudocode for its generation:
float a = seed/1000.0;
for(int i=1;i<0x400;i++)
{
  float b = int(a/127773);
  float c = a - b*127773;
  a = c*16807 - b*2836;
  if (a<0) a+=2147483647;
  xor_buf[i] = int(a/2147483647*256)&0xFF;
}
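Here is a minimal Python sketch of the generator above. It follows the pseudocode literally, including the loop that starts at index 1, so entry 0 of the array stays zero here; the `start` parameter is an assumption added so that this can be changed if it turns out to be wrong. The function name `make_xor_buf` is made up, and Python's double-precision floats may not round exactly like the original reader's arithmetic.

```python
def make_xor_buf(seed, start=1, size=0x400):
    """Build the XOR array from a seed, following the pseudocode step by step."""
    buf = bytearray(size)          # entries below `start` are left as zero
    a = seed / 1000.0
    for i in range(start, size):
        b = float(int(a / 127773))  # integer part, truncated toward zero
        c = a - b * 127773
        a = c * 16807 - b * 2836
        if a < 0:
            a += 2147483647
        buf[i] = int(a / 2147483647 * 256) & 0xFF
    return bytes(buf)
```

For a UBK file the seed would be file_seed from the header; for an EBK file the book text would use user_seed instead.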

The decryption uses the file offset of the data to index the array, and it skips bytes that would decrypt to 0x1A (the EOF symbol):

xor_val = xor_buf[file_offset%0x400];
val_out = val_in^xor_val;
if (val_out==0x1A)
val_out = val_in;
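The same step applied to a whole block of bytes might look like this in Python. It is a sketch only: `decrypt` is a made-up name, and `file_offset` is assumed to be the absolute position in the file of the first byte of `data`.

```python
def decrypt(data, xor_buf, file_offset):
    """XOR each byte with the array entry selected by its file offset."""
    out = bytearray(len(data))
    for n, val_in in enumerate(data):
        xor_val = xor_buf[(file_offset + n) % 0x400]
        val_out = val_in ^ xor_val
        if val_out == 0x1A:
            # Bytes that would turn into the EOF symbol are left as they are.
            val_out = val_in
        out[n] = val_out
    return bytes(out)
```

Because the array is indexed by file offset, a field has to be decrypted with the offset it actually has in the file (for example 0x12 for the title), not with an offset starting from zero.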

While the file_seed is stored directly in the file, user_seed is calculated as the Adler-32 checksum of a 128-byte user ID, which is stored directly in the personalized EmbiidReader.exe.

t1 = 1;
sum = 0;
i = 0;
do {
  t1 = (t1 + user_id[i++]) % 0xFFF1;
  sum = (t1 + sum) % 0xFFF1;
} while ( i < 0x80 );
user_seed = (sum<<16) | t1;
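In Python the same loop can be written out as below. The function name `user_seed_from_id` is made up, and where the 128-byte user ID sits inside EmbiidReader.exe is not covered here. Since the loop is the standard Adler-32 computation, `zlib.adler32` should give the same value for the same 128 bytes.

```python
import zlib

def user_seed_from_id(user_id):
    """Compute user_seed from the 128-byte user ID, as in the loop above."""
    if len(user_id) != 0x80:
        raise ValueError("user ID must be exactly 128 bytes")
    t1 = 1
    total = 0
    for byte in user_id:
        t1 = (t1 + byte) % 0xFFF1
        total = (t1 + total) % 0xFFF1
    return (total << 16) | t1

# Cross-check against the library implementation on a dummy ID.
dummy_id = bytes(range(128))
assert user_seed_from_id(dummy_id) == zlib.adler32(dummy_id)
```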

The text of the book uses a small subset of HTML tags for formatting, but the paragraphs are delimited by newlines, not <br> or <p> tags.
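One simple way to turn such text into a standalone HTML page is to wrap every non-empty line in a paragraph tag. The sketch below assumes the text has already been decrypted and decoded to a string (the character encoding is not discussed here); `text_to_html` is a made-up name and not necessarily what the script does.

```python
import html

def text_to_html(text, title="Untitled"):
    """Wrap newline-delimited paragraphs in <p> tags and add a page skeleton."""
    # The book text already contains HTML formatting tags, so it is not escaped.
    paragraphs = [line.strip() for line in text.splitlines() if line.strip()]
    body = "\n".join("<p>%s</p>" % p for p in paragraphs)
    return (
        "<html><head><title>%s</title></head>\n<body>\n%s\n</body></html>\n"
        % (html.escape(title), body)
    )
```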

Here's a small Python script to convert an Embiid book to HTML. A valid EmbiidReader.exe is necessary to decrypt personalized books.
Google Pages
You will need Python to run it.
Place your books, EmbiidReader.exe and the script into the same directory and run the script from the command prompt with <book.ebk> as its argument.
You should get a <book.html> file with decoded text.


Anonymous said...

Any idea why your Python script would miss chunks of .ebk files? I've got a bunch of Sharon Lee & Steve Miller titles that I've bought, and would like to get them into a format I can more easily read. I'm buying them from Baen or Fictionwise as they become available, but not everything is available yet. :-(

Maxine said...

I am a newbie and also have Sharon Lee and Steve Miller Embiid books and would love to convert them to HTML.
How do I use the script to decode them?

Anonymous said...

Hi, I'm trying this but I can't find EmbiidReader.exe on my computer. Where should it be located? I can't find it for download on the internet either; it seems the publisher closed its doors, so what can I do? I'm still able to view the ebook, so it must be using some program to display it, right? Thanks.

Igor Skochinsky said...

@Anonymous: download Process Explorer from TechNet, run it and look for EmbiidReader.exe in the process list. Double-click it to see the full path to the exe.

Anonymous said...

Hi Igor, no luck finding EmbiidReader. The ebook is actually an exe (Ebook Pro), so I'm not sure how it works, and there is no trace of it using EmbiidReader when I open it. I got the ebk file from a temp folder created when the ebook is opened, and copied it to the same folder as the Python script, but I am still missing EmbiidReader. Thanks.

Anonymous said...

Hello Igor,
I have an ebk file which requires a program called imo bookshelf. Can I convert this ebk file, which runs with imo bookshelf, to a PDF file using your instructions? And where do I get EmbiidReader.exe?

Anonymous said...

Hi Igor,
I tried your instructions but I can't find EmbiidReader.exe. How can I get it?

Igor Skochinsky said...

My script only works for Embiid books. If you don't have EmbiidReader.exe, your book is probably not from Embiid.