What's Taters, Precious? - Parsing Potato Chat with iLEAPP

I was recently made aware of an iOS chat application called “Potato Chat,” for which no tool offered processing at the time (the app is now supported by several commercial providers).
In any case, my interest was piqued, and I installed the app on my iPhone 7 Plus test device, as I was able to jailbreak it and thus create a full file system backup without commercial tools.
Bruno Fischer, who is also a contributor to LEAPPs, helped me create test data. He will work on the Android version of the app and provide an ALEAPP parser in due course.
Online research showed that the app had already been investigated. DFIR examiner Forrest Cook (Whee30) covers it in two blog posts:
forrestcook.net - Potato Chat
forrestcook.net - Potato Chat 2, starchy boogaloo 🥔💬
He also provides his own Python scripts for processing binary blobs on GitHub.
His work helped me immensely in understanding the references created by the app, and when preparing group chats, I was able to use parts of his code with slight adjustments for interpreting blobs.
More on that later.
Potato Chat
On iOS, the app can be conveniently downloaded from the App Store (Adam ID 1204726898). The Android version is not available in the Play Store – here, the APK file must be downloaded from the developer's website and installed manually.
The main features of the app include:
- User-to-user chats
- “Secret chats” that are not synchronized with the web service
- Group chats
- Calls (within chats)
In addition, other features are offered, but these are not the focus of the current investigation:
- Sending and requesting payments
- Installation and use of additional apps and games via an internal app store
Data acquisition
I first collected some app information directly from the device. This allowed me to trace the bundle ID “org.potatochat.PotatoEnterprise” via the UFADE report. It was also apparent that the “Documents” directory was not shared with the user.
More detailed information could be obtained directly via Pymobiledevice3:
python3 -m pymobiledevice3 apps query org.potatochat.PotatoEnterprise
Output (abridged):
"CFBundleDisplayName": "Potato",
"CFBundleShortVersionString": "2.47.207050",
"CFBundleURLName": "org.telegram.TelegramHD",
"Container": "/private/var/mobile/Containers/Data/Application/C701A36E-63F1-4F6C-B579-59EB3AB2B9D2"
"GroupContainers":
{"group..ph.potatochat.Potato": "/private/var/mobile/Containers/Shared/AppGroup/7CF11843-0DC7-461A-84C2-7F33310EAEB0",
"group.org.potatochat.PotatoEnterprise": "/private/var/mobile/Containers/Shared/AppGroup/D2734A7E-41BD-4DDD-801E-C5623C27A927"
},
The practical aspect of this approach is that the relevant app directories are output directly, eliminating the need to search the backup for the appropriate app GUID. I plan to provide this information via a UFADE app report in the future.
The actual user data was found in the directory: /private/var/mobile/Containers/Shared/AppGroup/D2734A7E-41BD-4DDD-801E-C5623C27A927.
All data considered by the created parser is located here in the Documents subdirectory. This is only included in a complete file system image. There is also a Caches directory, which apparently contains pairs of image data and an associated thumbnail. These files were also found in a PRFS backup. The relationship between the images and the app will also be discussed later.
Locations
In Forrest Cook's blog articles, the naming of files and their storage locations has already been discussed extensively.
Therefore, here is just a summarized overview of the locations within the Documents directory:
| Artifact | Location |
|---|---|
| App database | tgdata.db |
| Group/Channel database (identifiers) | shareDialogList.db |
| Videos | ./video/remote<video-id>.mov |
| Images | ./files/image-remote-<image-id>/image.jpg |
| Other files | ./files/<file-id>/file, ./files/<file-id>/<file-name> |
| Other files (from Secret Chats) | ./files/local<file-id>/file, ./files/local<file-id>/<file-name> |
When looking at the tables in tgdata.db, I noticed that the table names have changed since Cook's research. At the time of his observations, the tables still had the suffix “_v32”, but this has now changed to “_v33”. This is taken into account in the parsing process, and the current table name for the table to be examined is queried first.
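Looking up the current table name can be sketched roughly as follows. This is a minimal illustration and not the code of the actual parser; the function name resolve_table and the in-memory demo database are assumptions made for the example.

```python
import re
import sqlite3


def resolve_table(conn, base_name):
    """Return the highest-versioned '<base_name>_v<N>' table, or None if absent."""
    pattern = re.compile(rf"^{re.escape(base_name)}_v(\d+)$")
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    ).fetchall()
    versions = []
    for (name,) in rows:
        match = pattern.match(name)
        if match:
            versions.append((int(match.group(1)), name))
    return max(versions)[1] if versions else None


# Demo with an in-memory database standing in for tgdata.db
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages_v33 (mid INTEGER)")
conn.execute("CREATE TABLE channel_messages_v33 (mid INTEGER)")
print(resolve_table(conn, "messages"))  # messages_v33
```

Anchoring the pattern at the start of the name keeps channel_messages_v33 from being mistaken for messages_v33.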
The following tables appear to be particularly relevant:
| Table | Content |
|---|---|
| messages_v33 | Chat and secret chat messages |
| channel_messages_v33 | Channels and group chats |
| users_v33 | Known users |
| contacts_v33 | Users in the contact list |
| convesations_v33 [sic!] | Overview of individual chats |
| media_cache_v33 | References to media and media types |
Processing
I initially took a “low hanging fruit” approach and worked through the messages_v33 table. Some of the messages here are simply in plain text:

However, what stood out were empty message fields that displayed text in the app. These were messages that contained both media and text. In this case, the text is located in the binary blob of the data column.
An example:

This binary blob belongs to a message from Bruno (who calls himself Anette in the chat) with a sent image (screenshot).
The following content can be extracted here:
| HEX value | Content |
|---|---|
| 59 D9 77 68 50 61 A0 39 | Image ID as a little-endian value. Read in reverse byte order, it gives the path ./files/image-remote-39a061506877d959/image.jpg |
| 34 31 35 32 34 32 35 38 35 34 34 31 33 35 36 30 31 35 33 | Image ID as a decimal number in ASCII: 4152425854413560153. Converted from decimal to hex: 39a061506877d959 |
| F0 9F 98 B3 | Accompanying text for the image in UTF-8. Here: 😳 |
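The three values from this example can be checked with a few lines of Python. This is only a sketch using the bytes shown above; the helper name image_id_from_le is made up for illustration.

```python
def image_id_from_le(raw):
    """Interpret the raw bytes as a little-endian unsigned integer."""
    return int.from_bytes(raw, "little")


le_bytes = bytes.fromhex("59 D9 77 68 50 61 A0 39")
ascii_bytes = bytes.fromhex(
    "34 31 35 32 34 32 35 38 35 34 34 31 33 35 36 30 31 35 33"
)

image_id = image_id_from_le(le_bytes)
ascii_id = int(ascii_bytes.decode("ascii"))      # decimal digits stored as ASCII
hex_id = format(image_id, "016x")                # hex form used in the folder name
caption = bytes.fromhex("F0 9F 98 B3").decode("utf-8")

print(image_id == ascii_id)                      # True
print(f"./files/image-remote-{hex_id}/image.jpg")
```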
The ASCII representation of the image identifier appears in the table media_cache_v33 as follows:
| date | media_type | media_id | mids |
|---|---|---|---|
| 1754973231 | 2 | 4152425854413560153 | 0b 00 00 00 |
→ date is a UNIX timestamp, here: “Tue Aug 12 2025 04:33:51 GMT+0000”
→ media_type describes the type of media being referenced. The following types have been assigned based on observations:
1 - Video
2 - Image (including GIF)
3 - Other files (including audio)
→ mids is available as a binary blob. Converted from little endian to a decimal value, this gives the message ID of the message with the image. Here, the message ID is 11. This number also appears in messages_v33 in the mid column.
This allows the message to be assigned to the media file, and the file path can be reconstructed using the media type according to the scheme described above. This worked very reliably in several tests, as all entries in the media_cache_v33 table could also be found as files in the Documents directory.
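Decoding a media_cache_v33 row might look like the sketch below. The function name and the returned dictionary keys are assumptions for illustration, and only the first four bytes of mids are read, since that is all the example row contains.

```python
from datetime import datetime, timezone

# Media types as assigned from observations in the text
MEDIA_TYPES = {1: "video", 2: "image", 3: "file"}


def decode_media_cache_row(date, media_type, media_id, mids):
    """Turn one media_cache row into readable values."""
    return {
        "timestamp": datetime.fromtimestamp(date, tz=timezone.utc),
        "media_type": MEDIA_TYPES.get(media_type, "unknown"),
        "media_id_hex": format(int(media_id), "016x"),
        # Assumption: the first 4 bytes hold one little-endian message ID
        "message_id": int.from_bytes(mids[:4], "little"),
    }


row = decode_media_cache_row(
    1754973231, 2, 4152425854413560153, bytes.fromhex("0b 00 00 00")
)
print(row["timestamp"].isoformat())  # 2025-08-12T04:33:51+00:00
print(row["message_id"])             # 11
```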
The user IDs of the conversation participants, the direction of the message, and the UNIX timestamp could also be read from the messages_v33 table. In short, everything you need to reconstruct a conversation history. The user IDs can be assigned names (or pseudonyms) via the users_v33 table.
This worked reliably until we started chatting in a “secret chat.”
In a “normal” chat, Anette had the user ID 96325535. In Secret Chat, the ID is now -2147483648. Message IDs, which are otherwise stored as consecutive one- or two-digit numbers, are also listed as values such as: -2147483622. Here, only the last three digits seem to change – however, this makes it impossible to assign them to user names or media.
The table convesations_v33 (yes, with the typo) provides a remedy here. The specified user ID (-2147483648), which is also the chat ID for individual chats, is used as the key value here. The participants column again contains data in binary format:

→ From the little-endian value: 9F CF BD 05 interpreted as a decimal number, we get the original user ID again: 96325535.
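The conversion can be reproduced with the struct module. As a side note, -2147483648 is the smallest value a signed 32-bit integer can hold, so one possible reading is that the secret chat IDs count upward from that minimum; this interpretation is an assumption and not confirmed by the test data. The helper name user_id_from_le is hypothetical.

```python
import struct

INT32_MIN = -2**31  # smallest signed 32-bit integer


def user_id_from_le(raw):
    """Decode 4 bytes as a little-endian signed 32-bit integer."""
    return struct.unpack("<i", raw)[0]


print(user_id_from_le(bytes.fromhex("9F CF BD 05")))  # 96325535
print(-2147483648 == INT32_MIN)                       # True
print(-2147483622 - INT32_MIN)                        # 26
```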
For the “universally valid” assignment of media files to chats, I use a small workaround. Only the entries in the media_cache_v33 table are of interest: blobs may reference other media, but there is no file that could be assigned to them. All entries of media_cache_v33 are therefore loaded into an array, and for each binary blob the array is iterated to check whether the blob contains the little-endian value of a media ID. For image and video data, the ASCII string is usually also contained in the blob. Other media, however, can only be reliably assigned via the little-endian equivalent.
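In simplified form, the workaround could look like this. It is a sketch under the assumption that media IDs are stored as 8-byte little-endian signed integers; the function name find_media_refs and the sample blob are made up for the example.

```python
def find_media_refs(blob, media_ids):
    """Return (media_id, ascii_also_present) for every ID found in the blob."""
    hits = []
    for media_id in media_ids:
        le_form = int(media_id).to_bytes(8, "little", signed=True)
        if le_form in blob:
            ascii_form = str(media_id).encode("ascii")
            hits.append((media_id, ascii_form in blob))
    return hits


# Synthetic blob: padding, the little-endian ID, more padding, the ASCII ID
sample_blob = (
    b"\x00\x01"
    + bytes.fromhex("59 D9 77 68 50 61 A0 39")
    + b"\x00\x13"
    + b"4152425854413560153"
)
print(find_media_refs(sample_blob, [4152425854413560153, 123]))
```

Matching on the little-endian form first follows the observation that it is the only representation present for all media types.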
Calls
Potato Chat allows audio and video calls, which can be initiated from the respective chats and thus also appear as entries in messages_v33. The following binary blob belongs to a video call:

→ The HEX value 8B E2 67 11 18 was found in all call entries.
→ 27 at address 00000010 could be assigned to a video call. The following values were traced:
- 01 and 23 for voice calls
- 05 and 27 for video calls
→ The last four bytes 3C 01 00 00 indicate the call duration in seconds as a little-endian value. In this case, 316 seconds (5 minutes and 16 seconds).
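These observations can be turned into a small decoder. The sketch below works on a synthetic blob: where the marker sits and how long a real blob is are assumptions, only the type byte at offset 0x10 and the trailing duration follow the description above. The name parse_call_blob is hypothetical.

```python
CALL_MARKER = bytes.fromhex("8B E2 67 11 18")
CALL_TYPES = {0x01: "voice", 0x23: "voice", 0x05: "video", 0x27: "video"}


def parse_call_blob(blob):
    """Return (call_type, duration_seconds), or None if this is no call blob."""
    if CALL_MARKER not in blob or len(blob) < 0x11 + 4:
        return None
    call_type = CALL_TYPES.get(blob[0x10], "unknown")
    duration = int.from_bytes(blob[-4:], "little")
    return call_type, duration


# Synthetic 24-byte blob (marker position chosen arbitrarily)
sample = bytearray(24)
sample[0:5] = CALL_MARKER
sample[0x10] = 0x27
sample[-4:] = bytes.fromhex("3C 01 00 00")
print(parse_call_blob(bytes(sample)))  # ('video', 316)
```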
Contacts
Sent contact data is also available as a blob:

→ The HEX value 63 56 0A B9 was found in all contact entries.
→ E0 F6 BD 05 is again the little-endian equivalent of the user ID, as already observed in the convesations_v33 table.
Coordinates
Depending on whether a location is being sent or received, the blobs differ slightly.
However, the structure of the first 32 bytes is identical:

→ The HEX value 6E D0 9E 0C was found in all entries with location data.
→ A3 6D 1C 7E A7 42 4A 40 is the latitude value in little endian. Converted to double: 52.5207364691553
→ 7D 3B C7 BF A0 D1 2A 40 is the longitude value in little endian. Converted to double: 13.409429543562277
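Both values are IEEE 754 doubles and can be decoded with struct. A minimal sketch, with decode_coordinates as a hypothetical helper name:

```python
import struct


def decode_coordinates(lat_raw, lon_raw):
    """Decode two 8-byte little-endian doubles into (latitude, longitude)."""
    lat = struct.unpack("<d", lat_raw)[0]
    lon = struct.unpack("<d", lon_raw)[0]
    return lat, lon


lat, lon = decode_coordinates(
    bytes.fromhex("A3 6D 1C 7E A7 42 4A 40"),
    bytes.fromhex("7D 3B C7 BF A0 D1 2A 40"),
)
print(round(lat, 4), round(lon, 4))  # 52.5207 13.4094
```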
A Google Maps search for these coordinates shows the Berlin TV tower:

This also matches the location shown in the app.
If the location was sent by the user themselves, the coordinates are followed by a Bplist with additional information from the location service “Foursquare.”

The information contained is already recognizable in the ASCII string. However, if the plist is extracted, it can also be read in a structured manner. This is an NSKeyedArchiver plist, in which the values are not directly available as key/value pairs, but are assigned via references (CF$UID):


Thankfully, the get_plist_content function from iLEAPP takes care of assigning keys and values. The venueId value (4c263a3cf1272d7fe72386c5) can then be looked up via the Foursquare web service (registration required) by appending it to the URL.
Call: https://foursquare.com/v/4c263a3cf1272d7fe72386c5
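To illustrate what resolving CF$UID references means, here is a simplified sketch based on Python's plistlib. It is not the implementation of get_plist_content, and the sample archive is synthetic: only the venueId value is taken from the test data, the rest of the structure is an assumption. Real archives also wrap dictionaries and arrays in NS.keys and NS.objects, which this sketch does not handle.

```python
import plistlib
from plistlib import UID


def resolve(value, objects):
    """Recursively replace UID references with the objects they point to."""
    if isinstance(value, UID):
        return resolve(objects[value.data], objects)
    if isinstance(value, dict):
        # Class metadata is skipped to keep the result readable
        return {k: resolve(v, objects) for k, v in value.items() if k != "$class"}
    if isinstance(value, list):
        return [resolve(v, objects) for v in value]
    return value


def unarchive(data):
    """Load a keyed archive and resolve everything reachable from $top."""
    archive = plistlib.loads(data)
    return resolve(archive["$top"], archive["$objects"])


# Synthetic archive: 'root' points at object 1, whose venueId points at object 2
sample = plistlib.dumps(
    {
        "$archiver": "NSKeyedArchiver",
        "$version": 100000,
        "$top": {"root": UID(1)},
        "$objects": ["$null", {"venueId": UID(2)}, "4c263a3cf1272d7fe72386c5"],
    },
    fmt=plistlib.FMT_BINARY,
)
print(unarchive(sample))
```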

Responses
Responses to messages can be identified by the ASCII string replyMessage:

→ The message ID of the message being referenced is stored as follows: 01 69 02 <4 byte little endian message ID> 02 - Here: 0D 00 00 00 = message ID 13
→ The original message is contained again in the blob, here: “Such mal Hash Land”
→ The actual reply is again located in the message column
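Extracting the referenced message ID could be sketched as follows. The byte 0x69 is the ASCII letter "i"; that the pattern appears after the replyMessage string and that the ID is signed are assumptions, as is the synthetic blob used for the demo. The name reply_message_id is hypothetical.

```python
REPLY_TAG = b"replyMessage"
ID_FIELD = bytes.fromhex("01 69 02")  # observed prefix before the 4-byte ID


def reply_message_id(blob):
    """Return the ID of the referenced message, or None if nothing is found."""
    start = blob.find(REPLY_TAG)
    if start < 0:
        return None
    pos = blob.find(ID_FIELD, start)
    if pos < 0 or pos + 7 > len(blob):
        return None
    return int.from_bytes(blob[pos + 3:pos + 7], "little", signed=True)


# Synthetic blob: tag, filler, ID field with message ID 13, original text
sample = (
    b"\x0creplyMessage\x00\x00"
    + bytes.fromhex("01 69 02 0D 00 00 00 02")
    + b"Such mal Hash Land"
)
print(reply_message_id(sample))  # 13
```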
Group chats
Forrest Cook has already examined the structure of group chats in detail: forrestcook.net - Potato Chat 2, starchy boogaloo 🥔💬. As a result, I was able to integrate parts of his scripts into an iLEAPP parser with only a few adjustments.
While individual chats can be read directly in the SQLite browser in some cases, all content in group chats is stored as binary blobs.
The following columns of channel_messages_v33 appear to be relevant here:
cid → Chat ID as a little-endian signed integer supplemented by 4 bytes
mid → Message ID
data → Binary blob with content
Calculating the chat ID of the group or channel is not trivial, but it is also not really necessary, as the group ID is also contained in the data blob.
Cook's script systematically processes the respective blob, reads (ASCII) identifiers for the following content, determines the length of the content, and finally displays the respective content:
| Identifier | Content | Comments |
|---|---|---|
i | Message ID | Little Endian |
sk | Group ID | The ID is calculated from the first four bytes |
out | Outgoing | 1=True, 0=False |
fi | User ID | Little Endian |
t | Message (text) | Errors occurred here when interpreting longer messages. Cook reads the length prefix of the string as a plain int, although he already suspected that it is a VarInt. Interpreted as a VarInt, the message is displayed correctly. |
d | Unix timestamp | Little Endian |
md | Media references | In some cases, comparable blob data can be observed, as in individual chats |
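Reading a VarInt length prefix can be sketched like this. The exact variant used by the app is not documented here; the example assumes the common base-128 encoding with 7 data bits per byte, least significant group first, and a continuation flag in the top bit. The function names are hypothetical.

```python
def read_varint(data, offset=0):
    """Decode a base-128 VarInt; return (value, offset after the VarInt)."""
    result = 0
    shift = 0
    while True:
        byte = data[offset]
        offset += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # top bit clear: last byte of the VarInt
            return result, offset
        shift += 7


def read_text(data, offset=0):
    """Read a VarInt-prefixed UTF-8 string; return (text, next offset)."""
    length, offset = read_varint(data, offset)
    end = offset + length
    return data[offset:end].decode("utf-8"), end


print(read_varint(bytes.fromhex("AC 02")))  # (300, 2)
print(read_text(b"\x05Hello"))              # ('Hello', 6)
```

For strings shorter than 128 bytes, the VarInt is a single byte and matches a plain one-byte length, which would explain why only longer messages caused errors.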
As with individual chats, users could be assigned to their IDs via the users_v33 table. However, group/channel names cannot be found in tgdata.db. As Cook noted in his blog post, these are stored in another database called shareDialogList.db. This database contains only one table named share_dialog_list_users_v29. This numbering has not changed since Cook's research. The following values were found for the typeId column:
1 → Contacts (also found in tgdata.db)
2 → Groups/Channels
The contents of the userInfosJson column are (surprise!) in JSON format:
Example:
{
"groupId" : 84273181,
"typeId" : 3,
"title" : "My group",
"fileUrl" : "2_-5575343228968248862_252685026_4611130009646035738_1",
"falgs" : 16794752,
"accessHash" : 7527350728923374105
},
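Mapping group IDs to names from such entries might look like the sketch below. It assumes that each value of userInfosJson holds a single JSON object like the one shown; the function name group_titles is made up for the example.

```python
import json


def group_titles(user_infos_json_values):
    """Build a {groupId: title} mapping from userInfosJson column values."""
    titles = {}
    for raw in user_infos_json_values:
        entry = json.loads(raw)
        if "groupId" in entry:
            titles[entry["groupId"]] = entry.get("title")
    return titles


sample = """{
  "groupId": 84273181,
  "typeId": 3,
  "title": "My group",
  "fileUrl": "2_-5575343228968248862_252685026_4611130009646035738_1",
  "falgs": 16794752,
  "accessHash": 7527350728923374105
}"""
print(group_titles([sample]))  # {84273181: 'My group'}
```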
Output
I processed the collected findings in an iLEAPP parser (potatoChat.py) and was able to reliably prepare the chats in the expected form for the test data at hand.



Further findings
As mentioned at the beginning, the Caches folder is not entirely irrelevant. For example, it also contains the user images of chat partners. Unfortunately, it has not yet been possible to determine whether these images follow a pattern in their naming or whether they are hashed names.
In some cases, identical image data was found that not only shows the same content but is actually bit-identical:

Preview images of the geo-coordinates from the chats were found in the directory ./Documents/tempcache_v1/store.
Example - file name 8ff73b9b278416570be48f70e2426267:

The coordinates determined previously can be found in the file data.mdb under ./Documents/tempcache_v1/meta:

Not all questions have been answered yet. Further artifacts can certainly be identified. If you have any suggestions or ideas for improvement, please don't hesitate to contact me 😉
