PV2 Firmware Analysis

Pure Digital's Ritz Camera Dakota PV2 LCD variant

[camera pic]



BOOTLOADER
It appears that there is a small bootloader program built into the ASIC. Although I'm not sure of the exact size, it could be up to 0x2000, 0x4000, or (unlikely) 0xB000 bytes.  The purpose of this program is to initialize the SDRAM, load the file FIRMWARE.BIN from the flash memory, verify its checksum, and begin execution. The bootloader is probably hard-coded into a ROM on the ASIC, although it could be in a small rewritable memory.  The idea behind the bootloader is to provide a small, simple, reliable piece of code that can be put in more expensive and less-often-changed memory so that the bulk of the main (and larger) program can be loaded from cheaper memory.

The bootloader appears to perform a checksum of FIRMWARE.BIN. I was able to swap the bytes at offsets $03825 and $03811 and still pass; this means it is not a complicated position-dependent checksum like CRC or MD5. Since these offsets are $14 (20) bytes apart, they have the same remainders when divided by 4 and 2 but not by 8 or 16, so the checksum could be adding words of 1, 2, or 4 bytes (note: this can be independent of the size of the checksum). That rules out 8- and 16-byte words, which would in any case be unusually large and provide no benefit for such a simple algorithm.
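The swap experiment can be reproduced on synthetic data. This is only a sketch: the buffer below is a made-up stand-in for the firmware (not real camera data), and zlib's CRC-32 is used purely as an example of a position-dependent checksum, not as a claim about anything the bootloader does.

```python
import zlib

def sum_of_bytes(data: bytes) -> int:
    """Additive checksum: each byte contributes the same wherever it sits."""
    return sum(data) & 0xFFFFFFFF

def swap(data: bytes, a: int, b: int) -> bytes:
    """Return a copy of data with the bytes at offsets a and b exchanged."""
    out = bytearray(data)
    out[a], out[b] = out[b], out[a]
    return bytes(out)

# Synthetic stand-in for the first part of the image (hypothetical contents).
image = bytes((i * 7 + 3) & 0xFF for i in range(0x4000))
patched = swap(image, 0x3825, 0x3811)

# A plain byte sum cannot see the swap; a CRC can.
print("byte sum unchanged:", sum_of_bytes(image) == sum_of_bytes(patched))
print("CRC-32 unchanged:  ", zlib.crc32(image) == zlib.crc32(patched))
```

A firmware that still boots after such a swap is therefore consistent with a simple additive (or XOR) checksum and inconsistent with a CRC.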



Algorithm                        Range        Value     2's comp  1's comp
-------------------------------  -----------  --------  --------  --------
Sum of bytes                     00000-1efff  00cb4c38  ff34b3c8  ff34b3c7
                                 00000-1f1ff  00cb5200  ff34ae00  ff34adff
                                 00000-1f1fd  00cb50d8  ff34af28  ff34af27
Sum of 2-byte words (msb first)  00000-1efff  661cdeee  99e32112  99e32111
                                 00000-1f1ff  661fcec9  99e03137  99e03136
                                 00000-1f1fd  661ef679  99e10987  99e10986
Sum of 2-byte words (lsb first)  00000-1efff  65faa54a  9a055ab6  9a055ab5
                                 00000-1f1ff  65fd8337  9a027cc9  9a027cc8
                                 00000-1f1fd  65fd325f  9a02cda1  9a02cda0
Sum of 4-byte words (msb first)  00000-1efff  a02371bd  5fdc8e43  5fdc8e42
                                 00000-1f1ff  a5e65bd7  5a19a429  5a19a428
                                 00000-1f1fd  n/a
Sum of 4-byte words (lsb first)  00000-1efff  afd2288c  502dd774  502dd773
                                 00000-1f1ff  cbbcea90  34431570  3443156f
                                 00000-1f1fd  n/a
Xor of bytes                     00000-1efff  10        f0        ef
                                 00000-1f1ff  10        f0        ef
                                 00000-1f1fd  98        68        67
Xor of 2-byte words (msb first)  00000-1efff  1e0e      e1f2      e1f1
                                 00000-1f1ff  f7e7      0819      0818
                                 00000-1f1fd  2fb7      d049      d048
Xor of 2-byte words (lsb first)  00000-1efff  0e1e      f1e2      f1e1
                                 00000-1f1ff  e7f7      1809      1808
                                 00000-1f1fd  b72f      48d1      48d0
Xor of 4-byte words (msb first)  00000-1efff  86cf98c1  7930673f  7930673e
                                 00000-1f1ff  5c72ab95  a38d546b  a38d546a
                                 00000-1f1fd  n/a
Xor of 4-byte words (lsb first)  00000-1efff  c198cf86  3e67307a  3e673079
                                 00000-1f1ff  95ab725c  6a548da4  6a548da3
                                 00000-1f1fd  n/a


Two sums look interesting. The sum of bytes over addresses 0-1f1ff ends in 00 (and its complements end in 00 and ff), which is a nice round number and what a traditional checksum would produce. However, the sum of bytes over 0-1f1fd, which ends in 50d8, is much more interesting because the two bytes immediately following this range (the last two bytes of the file) are $d8 and $50.

Bytes 3fea-3fff of FIRMWARE.BIN contain similar bytes to the signature seen at the end. The value stored where I think the checksum is (the 3ffe-3fff) is $a4, $4b. A calculation of the checksum for the bytes before (0..3ffd) yields a sum of bytes of $001a4ba4, so it would appear that there are two checksums. When calculating them, the lower-address one must first be found before the upper address one can be found.

When the bootloader detects a bad checksum at $1f1fe, it will beep upon power-on or when the USB cable is connected. It's two mid-tone beeps in a row, so it sounds like a ~1 second beep with a stutter. When connected over USB, a short high-tone beep presumably means it has been enumerated by the bus. It still responds with the same vendor/product ID, but the available pipes seem different (more investigation needed). Hopefully this is a way to upload new code and fix a broken FIRMWARE.BIN.

I haven't tried to see what happens when the checksum at $3ffe is invalid, so it may not even be checked.
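The two checksums described above can be checked and regenerated with a few lines of Python. This is a sketch under stated assumptions: that only the low 16 bits of the byte sum are compared, that the value is stored low byte first (as the $d8, $50 and $a4, $4b pairs suggest), and that both locations are actually checked. The function names are mine, and the image used here is synthetic.

```python
def byte_sum16(data: bytes, end: int) -> int:
    """Low 16 bits of the sum of bytes over offsets 0 .. end-1."""
    return sum(data[:end]) & 0xFFFF

def stored16(data: bytes, offset: int) -> int:
    """Read a 16-bit value stored low byte first (d8 50 -> 0x50d8)."""
    return data[offset] | (data[offset + 1] << 8)

def checksum_ok(data: bytes, offset: int) -> bool:
    """True if the value stored at offset matches the sum of all bytes before it."""
    return byte_sum16(data, offset) == stored16(data, offset)

def fix_checksums(image: bytearray, offsets=(0x3FFE, 0x1F1FE)) -> None:
    """Patch checksums in ascending order: the upper sum covers the lower one."""
    for offset in sorted(offsets):
        value = byte_sum16(image, offset)
        image[offset] = value & 0xFF
        image[offset + 1] = value >> 8

# Synthetic 0x1f200-byte image (hypothetical contents, not camera firmware).
image = bytearray((i * 13 + 5) & 0xFF for i in range(0x1F200))
fix_checksums(image)
print(checksum_ok(image, 0x3FFE), checksum_ok(image, 0x1F1FE))
```

Patching the lower checksum first matters because its two bytes fall inside the range summed by the upper one.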

FIRMWARE.BIN - Overview

This file is not encrypted. It could have been, but that might have meant an expensive change to the ASIC. Encryption would have made it harder to understand what's going on, but when good security is used, obfuscating the code is useless (for an example, see OpenSSH).

File size is 0x1f200 (124.5KB).  There is about 97KB of 'real' non-repetitive data in the file. The serial number is not recorded in this file (it's in NVRAM.DAT), but that doesn't mean that the serial number doesn't contain some digits that identify the firmware version.

The processor is an 8-bit core made by ARC International named the ARClite. This processor was originally developed by Vautomation as the "V8 microRISC" before being bought by ARC. The only commercial development tools available seem to be from HI-TECH Software. If their ARClite C Cross Compiler was used to generate the code (which I strongly suspect), it looks pretty good. I haven't gotten far enough to see evidence of a real-time operating system, but HI-TECH also makes one named Salvo.

It is a very nice, small core, similar to many other 8-bit processors. The Wayback Machine's archive of Vautomation's old website provides the most detailed publicly available information on the chip.  A lot of the rarely used or easily emulated features of other processors (such as half-carry, zero page, BCD math, and ASL) are gone, and in their place are a generous 8 registers. This means that it can add two 32-bit numbers relatively quickly, and since variables don't need to be schlepped to and from memory as often, code is usually smaller and faster. But the coolest thing is that this processor has two user-defined instructions. These could be implemented in hardware to do any number of things, from critical parts of JPEG compression, to crypto functions, to DSP processing. I didn't see either of these instructions used in the PV2's firmware, though.

The processor has a 16-bit program counter, meaning it can access at most 64KB of memory at one time.  There are noticeable breaks in the file at locations (hex) 1000, 4000, 7000, a000, d000, 10000, 13000, 16000, 19000, 1c000, and 1f000. The first 0x1000 bytes are kept in one bank (at address 0000-0FFF), while the other banks are swapped into the memory range 1000-3FFF.  Each bank contains a copy of the run-time routines (such as 16-bit multiply and 32-bit divide) that are used by the C compiler -- this explains why there is so much repetition. The left-over space at the end of each bank is filled with 0x30 -- people will often fill this with a reset or break instruction to stop a run-away program during a crash, but the code at the beginning of each bank seems to reinitialize this to zero.
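The bank layout implied by those breaks can be written down as a small address translation. This is a sketch: numbering the swapped banks from 0 is my own convention, and the constant and function names are hypothetical.

```python
COMMON_SIZE = 0x1000   # file 0x0000-0x0FFF, always visible at CPU 0x0000-0x0FFF
BANK_SIZE = 0x3000     # each swapped bank appears at CPU 0x1000-0x3FFF
CODE_END = 0x1F000     # the last 0x200 bytes of the file are not banked code

def file_to_cpu(offset: int):
    """Map a FIRMWARE.BIN offset to (bank, cpu_address); bank is None for the common area."""
    if not 0 <= offset < CODE_END:
        raise ValueError("offset is outside the banked code area")
    if offset < COMMON_SIZE:
        return None, offset
    bank, within = divmod(offset - COMMON_SIZE, BANK_SIZE)
    return bank, COMMON_SIZE + within

def cpu_to_file(bank, address: int) -> int:
    """Inverse mapping: (bank, cpu_address) back to a file offset."""
    if 0 <= address < COMMON_SIZE:
        return address
    if not COMMON_SIZE <= address < COMMON_SIZE + BANK_SIZE:
        raise ValueError("address is outside 0x0000-0x3FFF")
    return COMMON_SIZE + bank * BANK_SIZE + (address - COMMON_SIZE)

for offset in (0x0800, 0x1000, 0x3FFE, 0x4000, 0x1EFFF):
    bank, address = file_to_cpu(offset)
    print(f"file {offset:05x} -> bank {bank}, cpu {address:04x}")
```

With this numbering the file holds ten swapped banks (0 through 9), one per 0x3000-byte stretch between the breaks listed above.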

I wrote a little utility to analyze repeats in the code. Basically, it looks at 8-byte chunks and searches the entire file to see if each one is repeated anywhere. The results are interesting. Unique blocks are indicated with a ".", blocks with 2 copies are marked with a "2", and so on, through "A" (10) to "Z" (36), and to "*" (>36). You can clearly see the filler bytes at the end of each block ("*"), plus some common code at the beginning ("BBB").
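A rough reconstruction of that idea looks like the following. It is not the original utility: it assumes chunks are compared only at 8-byte-aligned positions, and it assigns letters consecutively from "A" = 10, which puts "Z" at 35, so the exact cutoff for "*" may differ by one from the tool described here.

```python
from collections import Counter

CHUNK = 8

def symbol(count: int) -> str:
    """One character per chunk: '.', '2'-'9', 'A'-'Z', then '*'."""
    if count <= 1:
        return "."
    if count <= 9:
        return str(count)
    if count <= 35:
        return chr(ord("A") + count - 10)
    return "*"

def repeat_map(data: bytes, chunk: int = CHUNK) -> str:
    """Mark each aligned chunk with how many times it occurs in the file."""
    chunks = [data[i:i + chunk] for i in range(0, len(data) - chunk + 1, chunk)]
    counts = Counter(chunks)
    return "".join(symbol(counts[c]) for c in chunks)

# Tiny synthetic demo: a repeated chunk, a run of 0x30 filler, a unique chunk.
demo = bytes(range(8)) + b"\x30" * (8 * 40) + bytes(range(8, 16)) + bytes(range(8))
print(repeat_map(demo))
```

Printing the map 64 characters per line gives one row per 512 bytes of firmware, which makes the bank boundaries easy to spot.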

At the end of the file are a seemingly-extra 0x200 bytes. They don't appear to be referenced by the code (so far), but they look like a signature. The signature describes the version of the FIRMWARE.BIN file and probably includes a checksum. It may also provide some basic variables for the bootloader to use if the firmware has been corrupted.

0001f000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
(more repeated 00's)
...
0001f1e0  00 00 00 00 00 00 00 00  00 00 06 27 00 ca 0d 02  |...........'....|
0001f1f0  10 64 af 40 91 40 03 00  16 00 4c 61 4d 53 d8 50  |.d.@.@....LaMS.P|
1f1ea:   06 = Hardware ID
1f1eb: 0027 = Type ID (also camera's USB Product ID)
1f1ed: 0dca = SMaL's USB Vendor ID
1f1ef:   02 = firmware type
1f1f0: 6410 = firmware version
1f1f2: 40af = ?  \ these contain many of the same
1f1f4: 4091 = ?  / digits found in serial numbers
1f1f6: 0003 = ?
1f1f8: 0016 = ?
1f1fa: LaMS = SMaL backwards
1f1fe: 50d8 = checksum: sum of bytes over 0-1f1fd

This echoes the bytes at the end of the first bank of code, except the last two bytes are different:

03fea: 06 27 00 ca 0d 02
03ff0: 10 64 af 40 91 40 03 00 16 00 4c 61 4d 53 a4 4b

That means that I can probably figure out the meaning of the appended bytes if I can see how the first-bank bytes are used.
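The field layout listed above for 1f1ea-1f1ff can be decoded with Python's struct module. This is a sketch: the field names are my own labels, the "unknown" fields are exactly as unknown as in the list, and all multi-byte values are assumed to be little-endian, as the listed values imply.

```python
import struct
from collections import namedtuple

# 22 bytes, little-endian, no padding: 0x1f1ea through 0x1f1ff.
TRAILER_FORMAT = "<BHHBHHHHH4sH"
Trailer = namedtuple(
    "Trailer",
    "hardware_id type_id vendor_id firmware_type version "
    "unknown_f2 unknown_f4 unknown_f6 unknown_f8 magic checksum",
)

def parse_trailer(block: bytes) -> Trailer:
    """Decode the 22-byte signature block found at the end of the file."""
    return Trailer._make(struct.unpack(TRAILER_FORMAT, block))

# The bytes shown in the hex dump above, starting at 0x1f1ea.
raw = bytes.fromhex("062700ca0d02" "1064af4091400300" "16004c614d53d850")
trailer = parse_trailer(raw)
print(f"version {trailer.version:04x}, vendor {trailer.vendor_id:04x}, "
      f"magic {trailer.magic!r}, checksum {trailer.checksum:04x}")
```

The same function should apply to the 22 bytes at the end of the first bank, since they follow the same layout with a different checksum.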

DISV8
To help me analyze the file, I'm writing a disassembler.  The basic idea is to translate hex bytes back into assembly language instructions. This gets us a lot of the way to understanding what is happening, but, since most of these assembly language instructions are generated by a C compiler, they are not as elegant or as easy-to-understand as hand-written assembly. (It looks like the BIOS of the VMU was hand-written, while the original Dakota camera firmware was also compiled C). So, I've incorporated a bit of a simulator into the disassembler -- it can remember register contents and provide comments that summarize what groups of instructions are doing.

The disassembler is mostly done and has been stable for a few months. It seems to have a problem calculating CRCs for some functions, but this isn't worth fixing yet. If there are errors that prevent it from being ported to other operating systems, I'll definitely fix them.

DISV8's Simulator-Generated Comments

A simple example of the simulator's output is basic initialization -- the program remembers the value of R0 and indicates what it will be:
CLR R0
STA R0,$f100       ;($f100) = $00

A slightly longer version that is a little trickier would be:
CLR R0
STA R0,$f100       ;($f100) = $00
INC R0
STA R0,$1234       ;($1234) = $01

Here's some even trickier code. It basically does "R5:R4 = ($e1d):($e1c) & 00:03":
018c: e2 03    +29  L018C:   LDI R2,#$03
018e: e3 00    +30           LDI R3,#$00
0190: e8 1c0e  +31           LDA R0,$0e1c
0193: 22       +32           AND R2
0194: 72       +33           T0X R2
0195: e8 1d0e  +34           LDA R0,$0e1d
0198: 23       +35           AND R3
0199: 73       +36           T0X R3             ;R3 = $00
019a: 82       +37           PSH R2
019b: 8c       +38           POP R4
019c: 83       +39           PSH R3             ;Push $00
019d: 8d       +40           POP R5

Notice the comments at $0199 and $019c? Because $e1d is AND'ed with a $00, the result will always be $00. This is a pretty good tip-off that a compiler was used; depending on the usage (if the value 00:03 isn't likely to change), a good assembly language programmer would have replaced the code with:

L018C:   LDA R0,$0e1c
         LDI R4,#$03
         AND R4
         T0X R4
         LDI R5,#$00

The automatic commenting helps provide a little more information so that I don't have to look over the code carefully whenever I ask "did the compiler really just do that?"

Incidentally, the compiler-generated code could have been made simpler by casting the variable at e1d:e1c to an 8-bit value before anding, and then re-casting back into a 16-bit value. But that's painful to type in C, and since the code works, there is no feedback from the compiler that it generated unnecessary code. Ultimately, though, the compiler could have done a better job.
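The register tracking behind the first two examples in this section can be sketched in a few lines. This is an illustration in Python, not DISV8's actual code: it models only CLR, INC, and STA, with semantics assumed from the examples (CLR zeroes a register, INC adds one, STA stores a register to an address), and it forgets everything when it meets an instruction it doesn't model.

```python
def annotate(program):
    """Add ';(addr) = $xx' comments where the stored value is known."""
    known = {}      # register name -> value currently known to be in it
    listing = []
    for mnemonic, operands in program:
        parts = operands.split(",")
        reg = parts[0]
        comment = ""
        if mnemonic == "CLR":
            known[reg] = 0x00
        elif mnemonic == "INC":
            if reg in known:
                known[reg] = (known[reg] + 1) & 0xFF
        elif mnemonic == "STA":
            if reg in known:
                comment = f";({parts[1]}) = ${known[reg]:02x}"
        else:
            known.clear()   # unmodelled instruction: assume nothing is known
        text = f"{mnemonic} {operands}"
        listing.append((text.ljust(19) + comment).rstrip())
    return listing

program = [("CLR", "R0"), ("STA", "R0,$f100"), ("INC", "R0"), ("STA", "R0,$1234")]
listing = annotate(program)
print("\n".join(listing))
```

Extending the same table of known values to LDI, AND, and the push/pop pairs would be the natural route to comments like "R3 = $00" in the longer listing.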

DISV8 Comment File - FIRMWARE.COMMENTS

The disassembler is designed to help respect the copyrights of the analyzed code's authors. I couldn't distribute my commentary to the original Dakota camera's code because my comments had been mixed in with the disassembled instructions. DISV8 automatically merges in comments from a separate publicly-distributable comments-only file -- this file can be generated by hand, or easily stripped out from an annotated listing using a small utility.

Here's a portion of an example file:
**{--------------------------------------------- F66D2A24.000F8 ----
** Function: memset
** In:    R2 = value
**     R4:R3 = start address
**      $e18 = count
**
** ;memset: set ($e18) bytes at R4:R3 to R2
**}-----------------------------------------------------------------

**{--------------------------------------------- 037DB68B.009E1 ----
** Function: main
** I/O:
**
** Entry point after clearing BSS
**}-----------------------------------------------------------------
+20 //($f72e) &= $f8
+28 //($f71b) &= $fe
+35 //($f71b) |= $40
+88 //e13:e12:e11:e10 = 00:01:00:00, used by L020F
+89 //R2 = $00, byte for memset called by L020F

The first line of each function header (starting with "**{") contains a CRC for the function and its original location. If a subsequent version of firmware moves a function, or if a function is repeated, the CRC will remain the same and the comments can be moved to the appropriate location.  This information isn't fully used by disv8 yet, so the comments are still very firmware-version dependent.

Function memset has a special feature -- the line with a semicolon in it will be an automatically appended comment to any other function that calls this - this makes those functions much easier to read.
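To make the file format concrete, here is a sketch of a reader for the pieces shown in the example: the "**{" header with its CRC and location, the "Function:" name, the semicolon summary line, and the "+N" per-line comments. It is written in Python for illustration, is not DISV8's parser, and the regular expressions and dictionary keys are my own guesses based only on the sample above.

```python
import re

HEADER = re.compile(r"^\*\*\{-+ ([0-9A-Fa-f]{8})\.([0-9A-Fa-f]+) -+$")
NAME = re.compile(r"^\*\* Function: (\S+)")
SUMMARY = re.compile(r"^\*\* ;(.*)$")
LINE_COMMENT = re.compile(r"^\+(\d+) (.*)$")

def parse_comments(text: str):
    """Return one dict per function block found in a comments file."""
    functions, current = [], None
    for line in text.splitlines():
        line = line.rstrip()
        match = HEADER.match(line)
        if match:
            current = {"crc": match.group(1).upper(),
                       "location": int(match.group(2), 16),
                       "name": None, "summary": None, "lines": {}}
            functions.append(current)
        elif current is None:
            continue
        elif (match := NAME.match(line)):
            current["name"] = match.group(1)
        elif (match := SUMMARY.match(line)):
            current["summary"] = match.group(1)
        elif (match := LINE_COMMENT.match(line)):
            current["lines"][int(match.group(1))] = match.group(2)
    return functions

sample = """\
**{--------------------------------------------- F66D2A24.000F8 ----
** Function: memset
** ;memset: set ($e18) bytes at R4:R3 to R2
**}-----------------------------------------------------------------
**{--------------------------------------------- 037DB68B.009E1 ----
** Function: main
**}-----------------------------------------------------------------
+20 //($f72e) &= $f8
+28 //($f71b) &= $fe
"""
functions = parse_comments(sample)
for f in functions:
    print(f["name"], f["crc"], hex(f["location"]), f["lines"])
```

Keying the parsed blocks by CRC rather than by location is what would let comments follow a function that moves between firmware versions.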

Not only are comments allowed in function headers, but they can also attach to individual lines of assembly code. The main function does a lot of initialization -- most of the registers that get initialized with constants are automatically recognized by the disassembler, but some of the common C operators such as "&=" and "|=" generate harder-to-analyze code -- so I've commented the code by hand:

0a07: e1 f8    +17           LDI R1,#$f8
0a09: e8 2ef7  +18           LDA R0,$f72e
0a0c: 21       +19           AND R1
0a0d: c8 2ef7  +20           STA R0,$f72e       //($f72e) &= $f8

A much fancier disassembler would do this automatically, or at least generate "($f72e) = ($f72e) & $f8". It's a tradeoff -- this is pretty easy to identify by hand compared to the work it would take to make the computer recognize it. Hopefully this is a relatively rare case.
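For a sense of what that automatic recognition might involve, here is a sketch of a matcher for exactly the four-instruction "&=" shape shown above. It is hypothetical and not part of DISV8: it works on already-decoded (mnemonic, operands) pairs, handles only this one instruction order, and would miss any variation the compiler emits.

```python
def find_and_assign(instructions):
    """Return {index of STA: comment} for LDI/LDA/AND/STA read-modify-write runs."""
    found = {}
    for i in range(len(instructions) - 3):
        (m0, a0), (m1, a1), (m2, a2), (m3, a3) = instructions[i:i + 4]
        if (m0, m1, m2, m3) != ("LDI", "LDA", "AND", "STA"):
            continue
        mask_reg, separator, immediate = a0.partition(",#")
        if not separator or a2 != mask_reg:
            continue            # the AND must use the register loaded by LDI
        if not a1.startswith("R0,") or a1 != a3:
            continue            # load and store must hit the same address via R0
        address = a1[len("R0,"):]
        found[i + 3] = f"//({address}) &= {immediate}"
    return found

code = [
    ("LDI", "R1,#$f8"),
    ("LDA", "R0,$f72e"),
    ("AND", "R1"),
    ("STA", "R0,$f72e"),
]
print(find_and_assign(code))
```

Even this narrow matcher shows the tradeoff: each extra instruction ordering or register choice needs its own rule, while a person spots the pattern at a glance.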

The comments file also defines entry points (with the correct bank to use) and known variable names (such as the temporary registers used by the C library code).


DISV8 & FIRMWARE.COMMENTS Download

Both of these files are available in this beta release (June 21, 2005). It's an Xcode project, but since it's basically a command-line gcc-built tool, it should work on any platform. BlueDonkey.org has kindly provided a makefile. Drmn4ea has also ported a version of disv8 to Windows, but you'll need to download my Mac version to scavenge the latest version of FIRMWARE.COMMENTS.

The June 21st FIRMWARE.COMMENTS contains additional disassembly by BillW and a few changes by me. The disassembler has also been modified to handle structures (like HeaderBuf) better.

The included version of FIRMWARE.COMMENTS describes version 6410 of the firmware. This appears to be the oldest version -- it was included on the first camera I took apart.  These cameras are still available -- I bought two with this version in late December, 2004.  I tilted the odds in my favor by looking for the dustiest cameras on the shelf. This comments file is entirely my creation and contains no camera code. The firmware is not included (sorry, I can't send copies out) but even without the firmware, you can see the routines that I've figured out.
 
Some interesting statistics about my disassembly:
... all that from just a bunch of numbers found in the memory chip!


