• Deutsch
  • English
  • Deutsch
  • English
  • Home
  • Blog
  • UFADE

    • What is UFADE?
    • Installation
    • Connect devices
    • Navigation
    • Reporting
    • Extraction
    • Logging
    • Developer options
    • Advanced Options
    • Data Operations
  • Testpoints
  • Legal

Far over the misty mountains cold - iLEAPP Threema Parser

iLEAPP and Threema

Over Christmas, a family member asked me for a little help with data migration between two iPhones.
This also involved transferring chats from the Threema application from one device to the other.
Although this was not possible as desired (without setting up the second device again), a new iLEAPP parser at least made the chats readable.

Threema

The paid app can be purchased from the App Store (Adam ID 578665578).

Threema is essentially a chat application with E2E encryption and a focus on data protection and privacy. The app's code is even open source: Github. Developed in Switzerland, Threema is mainly used in German-speaking countries. The Swiss army and administration rely on Threema, as do various German politicians. But the application is also considered an (data protection) secure alternative to WhatsApp internationally. Recently, the app was featured in the Darcy Jenson case. Three Australian citizens are alleged to have received instructions from a “Mr. Big” via Threema to murder Australian Zivan Radmanovic on the Indonesian island of Bali.

The fact that no phone number or email address is required for registration contributes to the anonymity of using Threema.

So far, I have only looked at the iOS version. An analysis of the Android version may follow later.


Data backup

Conveniently, the Threema database is not only included in an FFS backup, but also in an iTunes backup (and thus in a UFADE PRFS backup). User data is located (as is often the case with iOS applications) in the “Shared” directory:

/private/var/mobile/Containers/Shared/AppGroup/{uuid}

Outsourced media files are located in the hidden subfolder: .ThreemaData_SUPPORT/_EXTERNAL_DATA/

Special thanks go to my colleague Bruno Fischer, who provided me with several innocuous data sets to test my parser and create this documentation.


Sources

User data is almost exclusively stored in the database: ThreemaData.sqlite. Thankfully, almost everything here (compared to PotatoChat) is readable in plain text and logically structured.

The following tables are particularly interesting:

TableContent
ZMESSAGEChats and group chats (messages)
ZCONVERSATIONChats and group chats (overview)
ZCONTACTKnown users
Z_PRIMARYKEYOverview of message types
ZAUDIODATAContent sent as audio (m4a)
ZIMAGEDATAContent sent as images (jpg)
ZVIDEODATAContent sent as videos (mp4)
ZFILEDATAContent sent as files (original file format)

Processing

The actual messages and most of the content can be found in the ZMESSAGE table. Texts are stored in plain text.

The following columns of the ZMESSAGE table were used for processing the messages:

ColumnContent
Z_PKPrimary key - unique assignment of a message entry
ZDATETimestamp of the message in Cocoa format (seconds since January 1, 2001)
ZISOWN1 = message sent by yourself, 0 = message from another user
ZSENDERZ_PK from ZCONTACT if it is a group chat message
ZREAD1 = message has been read, 0 = message has not been read
ZTEXTActual message
ZCAPTIONText associated with a media file
Z_ENTMessage type - mapped with Z_ENT from Z_PRIMARYKEY
ZFILENAMEFile name if the attachment was sent as a file
ZMIMETYPEMimetype (e.g., “image/jpeg”) if the attachment was sent as a file
ZAUDIOZ_PK from ZAUDIODATA if the file was sent as audio
ZIMAGEZ_PK from ZIMAGEDATA if the file was sent as an image
ZVIDEOZ_PK from ZVIDEODATA if the file was sent as a video
ZLONGITUDELongitude for sent locations
ZLATITUDELatitude for sent locations
ZACCURACYAccuracy of the location data

Media

The primary keys stored in the ZMESSAGE table enable individual media to be mapped to specific messages:

ZMESSAGE.ZAUDIOZMESSAGE.ZIMAGEZMESSAGE.ZVIDEOZMESSAGE.Z_PK
Mapping:ZAUDIODATA.Z_PKZIMAGEDATA.Z_PKZVIDEODATA.Z_PKZFILEDATA.ZMESSAGE
Medium / Reference:ZAUDIODATA.ZDATAZIMAGEDATA.ZDATAZVIDEODATA.ZDATAZFILEDATA.ZDATA

The media could be found as blobs in the ZDATA column. However, a distinction is made here:

First byte is 01: The rest of this blob contains the actual media file in binary form. The following image shows the typical JPEG header (FF D8 FF E0):

Byte is 01

First byte is 02: The ASCII string between the leading 02 and the trailing 00 byte contains a UUID as a reference to the media file (in the image: F61847CA-3769-457E-A345-BB1A09C651A1):

Byte is 02

A file with this name can be found under .ThreemaData_SUPPORT/_EXTERNAL_DATA/. All files in this directory have no file extension, regardless of media type:

Image referenced

SQL Query

Now, the insights gained just need to be translated into processing logic that is as universal as possible. The following SQL query serves as the data basis for iLEAPP:

SELECT
    m.Z_PK AS MID,
    m.ZDATE as COCOA,
    conv.Z_PK as C_ID,
    m.ZISOWN as OUT,
    CASE
        WHEN conv.ZGROUPNAME IS NOT NULL
            THEN conv.ZGROUPNAME
        WHEN cont.ZFIRSTNAME IS NOT NULL AND cont.ZLASTNAME IS NULL
            THEN cont.ZFIRSTNAME
        WHEN cont.ZFIRSTNAME IS NOT NULL AND cont.ZLASTNAME IS NOT NULL
            THEN cont.ZFIRSTNAME || ' ' || cont.ZLASTNAME
        WHEN cont.ZFIRSTNAME IS NULL AND cont.ZLASTNAME IS NOT NULL
            THEN cont.ZLASTNAME
        ELSE cont.ZPUBLICNICKNAME
    END AS CHAT,

    CASE
        WHEN m.ZISOWN = 1 THEN
            'local user'

        WHEN m.ZSENDER IS NOT NULL THEN
            CASE
                WHEN sd.ZFIRSTNAME IS NOT NULL AND sd.ZLASTNAME IS NULL
                    THEN sd.ZFIRSTNAME
                WHEN sd.ZFIRSTNAME IS NOT NULL AND sd.ZLASTNAME IS NOT NULL
                    THEN sd.ZFIRSTNAME || ' ' || sd.ZLASTNAME
                WHEN sd.ZFIRSTNAME IS NULL AND sd.ZLASTNAME IS NOT NULL
                    THEN sd.ZLASTNAME
                ELSE sd.ZPUBLICNICKNAME
            END

        ELSE
            CASE
                WHEN conv.ZGROUPNAME IS NOT NULL
                    THEN conv.ZGROUPNAME
                WHEN cont.ZFIRSTNAME IS NOT NULL AND cont.ZLASTNAME IS NULL
                    THEN cont.ZFIRSTNAME
                WHEN cont.ZFIRSTNAME IS NOT NULL AND cont.ZLASTNAME IS NOT NULL
                    THEN cont.ZFIRSTNAME || ' ' || cont.ZLASTNAME
                WHEN cont.ZFIRSTNAME IS NULL AND cont.ZLASTNAME IS NOT NULL
                    THEN cont.ZLASTNAME
                ELSE cont.ZPUBLICNICKNAME
            END
    END AS SENDER,
    CASE
        WHEN m.ZISOWN = 0 THEN
            'local user'
        ELSE
            CASE
                WHEN conv.ZGROUPNAME IS NOT NULL
                    THEN conv.ZGROUPNAME
                WHEN cont.ZFIRSTNAME IS NOT NULL AND cont.ZLASTNAME IS NULL
                    THEN cont.ZFIRSTNAME
                WHEN cont.ZFIRSTNAME IS NOT NULL AND cont.ZLASTNAME IS NOT NULL
                    THEN cont.ZFIRSTNAME || ' ' || cont.ZLASTNAME
                WHEN cont.ZFIRSTNAME IS NULL AND cont.ZLASTNAME IS NOT NULL
                    THEN cont.ZLASTNAME
                ELSE cont.ZPUBLICNICKNAME
            END 
        END AS RECEIVER,		
    m.ZREAD as "READ",
    p.Z_NAME as "MTYPE",
    CASE 
        WHEN m.ZTEXT IS NOT NULL THEN m.ZTEXT
        WHEN m.ZCAPTION IS NOT NULL THEN m.ZCAPTION
        ELSE NULL
    END AS MESSAGE,
    CASE
        WHEN ad.ZDATA IS NOT NULL THEN ad.ZDATA
        WHEN vd.ZDATA IS NOT NULL THEN vd.ZDATA
        WHEN id.ZDATA IS NOT NULL THEN id.ZDATA
        WHEN fd.ZDATA IS NOT NULL THEN fd.ZDATA
        ELSE NULL
    END AS MEDIA,
    CASE
        WHEN tn.ZDATA IS NOT NULL THEN tn.ZDATA
        ELSE NULL
    END AS THUMBNAIL,
    m.ZFILENAME AS FILENAME,
    m.ZMIMETYPE AS MIMETYPE,
    m.ZLATITUDE AS LATITUDE,
    m.ZLONGITUDE AS LONGITUDE,
    m.ZACCURACY AS ACCURACY
FROM
    ZMESSAGE m
JOIN
    Z_PRIMARYKEY p
    ON m.Z_ENT = p.Z_ENT
LEFT JOIN ZAUDIODATA ad ON ad.Z_PK = m.ZAUDIO
LEFT JOIN ZVIDEODATA vd ON vd.Z_PK = m.ZVIDEO
LEFT JOIN ZIMAGEDATA id ON id.Z_PK = m.ZIMAGE
LEFT JOIN ZFILEDATA fd ON fd.ZMESSAGE = m.Z_PK
LEFT JOIN ZIMAGEDATA tn ON tn.Z_PK = m.ZTHUMBNAIL
LEFT JOIN ZCONTACT sd ON sd.Z_PK = m.ZSENDER
LEFT JOIN
    ZCONVERSATION conv
    ON m.ZCONVERSATION = conv.Z_PK
LEFT JOIN
    ZCONTACT cont
    ON conv.ZCONTACT = cont.Z_PK;

Output

As announced at the beginning, I created an iLEAPP parser (Threema.py) and was able to reliably prepare the chats in the expected form for the test data at hand. This parser already takes into account the upcoming release of LAVA and already contains the necessary parameters for display as a “Conversation”:

HTML-Report

HTML view

LAVA

LAVA view


UFADE Adjustments

At the beginning, I explained that the Threema database (but without WAL and SHM) is also included in an iTunes backup. This makes it much easier to generate test data, but it also revealed a previously unknown (and rare) problem with the decryption logic used by UFADE in the iOSbackup module.

Bruno provided me with Threema data from FFS and PRFS backups. It was striking that the PRFS Threema databases were declared “malformed,” while the FFS databases opened without any problems. Comparisons of two databases from the same device showed that the PRFS data was too small here – the last approx. 100kb of the database were missing.

When reviewing the iOSbackup code, I identified a truncate() function as the “culprit,” which trims each output file to the specified size from manifest.db. If this function is commented out, the PRFS backup created afterwards also contains a readable Threema database. Since the missing truncate can cause problems with other data types (e.g., plist), a workaround has been integrated into UFADE that only prevents truncate for SQLite databases after checking the file size.