Audit your cloud data

Audits are easy with ZeroDark.cloud. First, all data is encrypted on the local device before being uploaded to the cloud. And the framework that performs this encryption is open-source, making it easy to inspect & verify. Second, you can perform an audit of the data that's stored in the cloud, allowing you to verify that your data is secure.

In this article, we'll walk through how you can perform an audit of ZeroDark.cloud. We'll show you how & where the client framework performs encryption. And we'll walk you through an audit of your cloud data.

 

Records & Data

Every node stored in the cloud is split into 2 components: the "record" and the "data". These are 2 separate files that get stored in the cloud. They're stored under the same filename, but have different file extensions. The record file has the extension ".rcrd", and the data file has the extension ".data".

The RCRD file

The record file contains only the treesystem metadata. It's a simple JSON file that contains the minimum information necessary for the server to handle the node.

Let's look at a real RCRD file, from my own bucket:

{
  "version": 3,
  "fileID": "978F7123EE7B4190BF65D23CC01CFFBB",
  "metadata": "uuhauyWYSvWyOhgeYl8ZuLZ3Tr3gU9Pk0e5y6jGodvxU7gIUVlILKWn8d63Nb0t/oN1ynfPU156prAvudVpsfWD4UsXCE8/nf/2xNxS6bI/XQUd9tELKlFJrMZmPe2pjVttCa1kF1hsVAPQ3bACApXDeJ8M=",
  "keys": {
    "UID:7g4jon8fea7pbmy68ezeaugdpukm5he7": {
      "perms": "rws",
      "key": "eyJ2ZXJzaW9uIjoxLCJlbmNvZGluZyI6IkN1cnZlNDE0MTciLCJrZXlJRCI6IjBMcUlMb05sWVFYcFhJbi9YcXpxdEE9PSIsImtleVN1aXRlIjoiVGhyZWVGaXNoLTUxMiIsIm1hYyI6InRQUWxZR0VPYk9NPSIsImVuY3J5cHRlZCI6Ik1JSEVCZ2xnaGtnQlpRTUVBZ01FZFRCekF3SUhBQUlCTkFJMEN1L2pRaHNXRHNTUm5VYUNqUDN1K1Y0VzM3Zy9xbzZOTXFpcytOMFU4aHFHSkJDbGxnRmppdnY3NGlWQ0RNRnlOR1Izd2dJMEg0cEZxWTFTWlYveVYwbnFkN2VOTzBQQVU0UGErRThFdGRUUHAyeFFmVTFEUk9XZ2lXK29JWGxWT2Y0SW8yem1BLzFCM3dSQXV2M1ErU0xMVFk5VHFOMVZvcTIrQlFDRUJkcnVzSkc5REU1ODZ0VUJZTjYrVFFDbU5DY0gyRmlsakVEa1lybTJTSnNQbjZJT0krY0xXeGVtWDVTY29BPT0ifQ=="
    }
  },
  "children": {
    "dir": {
      "prefix": "3D8E0A484DBC4DB7A5CD7D7F3669B3FE"
    }
  }
}

Here's what these fields mean:

  • version

This is just the version of the RCRD format itself. If we need to change the JSON structure in the future, this will get incremented.

  • fileID

This is an immutable UUID for the node. This value gets assigned by the server. It assists us in detecting when nodes have been moved or renamed. For example, if this file gets moved from path "/foo/bar" to "/buzz/lightyear", it will still have the same fileID, so we'll be able to detect that it was moved.

  • metadata

This section is encrypted, and must be decrypted in order to be read. In order to decrypt it, you'll need the node's encryptionKey. Every node has a different encryptionKey (randomly generated). So where is the encryptionKey? It's also encrypted, and stored in the keys section (which we discuss below).

If we decrypt the metadata section, we'll find the node's cleartext name (i.e. ZDCNode.name). This is because the actual filename (when stored in the cloud) is a hashed version of the cleartext name (with salt that comes from the parent node).

For example:

  • cleartext name: "The secret Coca-Cola recipe.txt"
  • cloud name: dcauqok66griorw7m7487itp3rtrceem.rcrd

This means the server cannot read filenames - it only sees hashes (with random salt). More details on how this works can be found in the encryption article.

  • keys

This stores all of the permissions. In this example, the user with userID "7g4jon8fea7pbmy68ezeaugdpukm5he7" has the following permissions:

  • (r)ead
  • (w)rite
  • (s)hare

The server will automatically send push notifications to all users in this list that have (r)ead permission. (Push notifications go out when the node is created, modified, moved/renamed or deleted. This applies to both the record & data file).

In addition to the permissions, this section includes a wrapped version of the node's encryptionKey. The term "wrapped" means that the node's encryptionKey is first encrypted (wrapped) using the user's publicKey, and it's this wrapped version that gets stored in the JSON. Therefore, unwrapping the node's encryptionKey requires the matching privateKey. And since user "7g4jon8fea7pbmy68ezeaugdpukm5he7" is the only person who knows their privateKey, only they can decrypt this blob.

  • children

For non-leaf nodes, a 'children' section may be present. This designates a dirPrefix to be used by child nodes. We'll be explaining this in more detail below when we start to audit the cloud.

You'll notice the record contains a mixture of encrypted & non-encrypted information. And the non-encrypted stuff encapsulates the minimum amount of information required by the server:

  • who has permission to modify this node ?
  • if the node is modified, who should I send push notifications to ?
  • if the node is deleted, where can I locate its children ?
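
If you're curious what a wrapped key looks like from the outside, you can Base64-decode the "key" value yourself. The sketch below decodes only the first 48 characters of the blob from the sample record above (Base64 decodes cleanly in groups of 4 characters), and the decode_b64 helper name is just an illustration, not part of any tool. What comes out is the start of a small JSON wrapper that describes the wrapped key. The key material inside that wrapper is still encrypted, so decoding it reveals nothing secret.

```shell
# Hypothetical helper: Base64-decode the string passed as the first argument.
decode_b64() {
  printf '%s' "$1" | base64 --decode
}

# First 48 characters of the "key" value from the sample RCRD file above.
KEY_PREFIX='eyJ2ZXJzaW9uIjoxLCJlbmNvZGluZyI6IkN1cnZlNDE0MTci'

decode_b64 "$KEY_PREFIX"
echo
# prints: {"version":1,"encoding":"Curve41417"
```

To decode a whole blob, pass the complete "key" string instead of the prefix.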

Code Audit

Record files are created in ZDCCryptoTools.m, in a method called cloudRcrdForNode:. You can read the code yourself. Or, better yet, set a breakpoint, and then try uploading a node. That way you can step through the code in the debugger.

 

The DATA file

The data file contains the content generated by your app, in an encrypted format. The unencrypted data is generated (by your app) via the following methods in your ZeroDarkCloudDelegate:

// These functions are part of the ZeroDarkCloudDelegate protocol.
// You implement them when integrating ZeroDark into your app.

// Framework is ready to upload a node's *.data file.
// It's asking you to supply the data section to upload.
// 
func data(for node: ZDCNode, at path: ZDCTreesystemPath, transaction: Any!) -> ZDCData {
   // Your app returns cleartext data here.
   // And ZeroDarkCloud handles the encryption & upload.
}

// Framework is ready to upload a node's *.data file.
// It's asking you to supply the metadata section to upload. (optional)
// 
func metadata(for node: ZDCNode, at path: ZDCTreesystemPath, transaction: Any!) -> ZDCData? {
   // Your app (optionally) returns cleartext data here.
   // And ZeroDarkCloud handles the encryption & upload.
}

// Framework is ready to upload a node's *.data file.
// It's asking you to supply the thumbnail section to upload. (optional)
// 
func thumbnail(for node: ZDCNode, at path: ZDCTreesystemPath, transaction: Any!) -> ZDCData? {
   // Your app (optionally) returns cleartext data here.
   // And ZeroDarkCloud handles the encryption & upload.
}

The data section is required, while the metadata & thumbnail sections are optional. And the return type, ZDCData, is a wrapper that allows you to return data in several different formats:

  • via in-memory-data
  • via a cleartext file
  • via an encrypted file
  • or as a promise (used to return data asynchronously)

Code Audit

These data sections are requested via the PushManager, in a method called "preparePutOperation". Depending on how you return your data, the PushManager will take different steps to achieve its goal, which is to combine all given sources into an encrypted format called "CloudFile format".

A CloudFile consists of the following sections:

  • header
  • metadata section (optional)
  • thumbnail section (optional)
  • data section

The CloudFile combines all these sections into a single file, and then encrypts the entire file using the node's encryptionKey. Further, each section can be downloaded independently via the DownloadManager or the ImageManager.

To audit the client framework, follow the code starting in [PushManager preparePutOperation:]. After the framework fetches the cleartext data from its delegate, it will attempt to convert the data into CloudFile format. And, ultimately, this means it will use the Cleartext2CloudFileInputStream. If you set breakpoints in [Cleartext2CloudFileInputStream read:maxLength:], you can follow the flow of data as it gets encrypted for upload.

Audit the Cloud

To audit the information in the cloud, you'll need to install the AWS Command Line Interface. Follow the install instructions here.

Next, extract the audit credentials via the ZeroDarkCloud framework. Add the following code to your application somewhere:

zdc.fetchAuditCredentials(localUserID) { (audit, error) in

  if let audit = audit {
    print("Audit:\n\(audit)")
  }
  else {
    print("Error fetching audit credentials: \(String(describing: error))")
  }
}

Notes:

  • You need to be logged into the account you want to audit
  • The fetchAuditCredentials function is only available when compiling the framework for DEBUG. It won't exist if you compiled the framework for RELEASE.

If all goes well it should print out something like this in your Xcode console:

Audit:
 - localUserID     : rry8eu1jjxhtspo3wgf1zhsumt35fshk
 - aws_region      : us-west-2
 - aws_bucket      : com.4th-a.user.rry8eu1jjxhtspo3wgf1zhsumt35fshk-8c51dd43
 - aws_accessKeyID : ASIA34QOC52MGH4RT4FI
 - aws_secret      : cypu5y4Rz3WKbdynuTcg6WmBlF6MmG0cZsjFGU8I
 - aws_session     : FQoGZXIvYXdzEOn//////////wEaDGQ5wTKG5ZAE7i/Q0CKEAk7Q1ujRmTchZFlwlUMQCdMmX8UEsL/Rt+s0CIt/+ehS/fCqcfNILTW4JZn9izsWKlwEZ7jYjw8HRpeVhf6QeeVU6xfqU1ofPR/V8nEqYtxHjnEcfFgvgbNsgfW/SUso8maT55UkYocqVnJCEgpa3OvvUzgqfoewqj2ZK0jrqr01ieFJ8PS0FiS3TfoMPhJMJgcyigP7OKo+q9kGv6G/CvgqVn1NWGGPFnYQQG1I/kZJnXQFEHLlcYHAvzjjXWypcC4wm7P2IHhe/BOrU7P3TH2OitvK1EVTnHYvByP4lSJtTx5eaNtX5ffkyB2fmedSubzdI4fXpxza99rrCQgchay7h9wBKKfk7e0F
 - aws_expiration  : Thursday, October 31, 2019 at 5:45:43 PM Pacific Daylight Time

Credentials are short-lived: they're generally only valid for about an hour. If you need more time, just run your app again and fetch fresh audit credentials.

Once you have these credentials, you can configure aws cli:

$ aws configure --profile paste_localUserID_here

The command prompts will walk you through inputting the credentials for the AWS CLI. But unfortunately, aws configure only prompts for the access key ID, the secret, the region and the output format. It doesn't prompt you for the session. So you have to add that manually afterwards:

$ aws --profile paste_localUserID_here configure set aws_session_token paste_session_here
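
If you'd rather skip the interactive prompts altogether, the same "configure set" subcommand can store every value, one at a time. Here's a minimal sketch. The configure_audit_profile function name and its argument order are made up for this example, and the values come from the Audit output printed in your Xcode console.

```shell
# Hypothetical helper: store temporary audit credentials in a named AWS CLI profile.
# Arguments: profile, region, access key ID, secret, session token.
configure_audit_profile() {
  aws configure set region                "$2" --profile "$1"
  aws configure set aws_access_key_id     "$3" --profile "$1"
  aws configure set aws_secret_access_key "$4" --profile "$1"
  aws configure set aws_session_token     "$5" --profile "$1"
}

# Usage (replace each paste_X_here placeholder with a value from the Audit output):
# configure_audit_profile paste_localUserID_here paste_region_here \
#   paste_accessKeyID_here paste_secret_here paste_session_here
```

Since the credentials expire, you'd simply run this again with fresh values whenever you fetch new ones.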

You can verify the information via:

$ cat ~/.aws/credentials

And now you're ready to inspect your S3 bucket. You can use various AWS CLI tools for this. But to get started, try this command:

$ aws --profile paste_localUserID_here s3api list-objects-v2 --bucket paste_bucket_here --delimiter /

(Be sure to replace each paste_X_here placeholder with the correct value for your user.)

Which should give you an output that looks something like this:

{
  "Contents": [
    {
      "Key": ".privKey",
      "LastModified": "2019-10-08T21:10:40.000Z",
      "ETag": "\"0d4a3c06c146c32ef0a56e05434f0bc7\"",
      "Size": 1344,
      "StorageClass": "STANDARD"
    },
    {
      "Key": ".pubKey",
      "LastModified": "2019-10-08T21:10:40.000Z",
      "ETag": "\"3d50c53a71418c2ae232a8a33592b4cd\"",
      "Size": 1169,
      "StorageClass": "STANDARD"
    }
  ],
  "CommonPrefixes": [
    {
      "Prefix": "com.4th-a.ZeroDarkTodo/"
    }
  ]
}

You can download any file in your bucket like this:

$ aws --profile paste_localUserID_here s3api get-object --bucket paste_bucket_here --key ".pubKey" "pubKey.json"

The above command would download the ".pubKey" file from the S3 bucket, and save it in your current directory, in a file named "pubKey.json".

The common files every bucket has are:

  • .pubKey: Everybody can download this from your bucket. It's a JSON file that contains your public key, the same public key that's stored in a smart contract on the blockchain.
  • .privKey: A wrapped (encrypted) version of your private key. The only way to get your real private key is to decrypt the content in this JSON file with your 256-bit access key. (Also, only your user has the proper S3 permissions to access this file.)

For clarification, your ".privKey" file is a JSON file. And here's what you need to understand about it:

{
  "version": 1,
  "encoding": "Twofish-256",
  "keySuite": "Curve41417",
  "privKey": "The value here is NOT your private key. This is a wrapped (ENCRYPTED) version of your private key. And you must use your access key to DECRYPT it."
}

If you want to inspect the files within your app container, then you can use the treeID for your app. For example:

$ aws --profile paste_localUserID_here s3api list-objects-v2 --bucket paste_bucket_here --prefix "com.4th-a.ZeroDarkTodo/" --delimiter /

Which will give you output that looks something like this:

{
  "CommonPrefixes": [
    {
      "Prefix": "com.4th-a.ZeroDarkTodo/00000000000000000000000000000000/"
    },
    {
      "Prefix": "com.4th-a.ZeroDarkTodo/18183C7EA3DE4C8598C0F6AE99F216AA/"
    },
    {
      "Prefix": "com.4th-a.ZeroDarkTodo/8BD6E7844D424750BEE00669C03C7612/"
    },
    {
      "Prefix": "com.4th-a.ZeroDarkTodo/AC8F239139F2441AAF3207A49CBBA1A4/"
    },
    {
      "Prefix": "com.4th-a.ZeroDarkTodo/C1AB045062B442A68DC2DA51EFA5B36B/"
    }
  ]
}

All app-generated content will have a path using the format:

  • {treeID}/{dirPrefix}/{hashedFileName}.[rcrd, data]

This is called the CloudPath, which is different from the treesystem path. For example, a node may have the following paths:

  • treesystem path : /foo/bar/buzz/lightyear/toystory
  • cloud path: com.4th-a.ZeroDarkTodo/C1AB045062B442A68DC2DA51EFA5B36B/dcauqok66griorw7m7487itp3rtrceem.rcrd

As discussed in the encryption article, the node-name is hashed. So in the example above, the name "toystory" gets hashed (with dirSalt), and turned into "dcauqok66griorw7m7487itp3rtrceem".
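
To make the format concrete, here's a tiny sketch that assembles a cloud path from its parts. The cloud_path function is a hypothetical helper written for this article. It does not perform the hashing, so the hashed name is taken as an input.

```shell
# Hypothetical helper: join the parts of a cloud path.
# Arguments: treeID, dirPrefix, hashed file name, extension (rcrd or data).
cloud_path() {
  printf '%s/%s/%s.%s\n' "$1" "$2" "$3" "$4"
}

cloud_path "com.4th-a.ZeroDarkTodo" \
           "C1AB045062B442A68DC2DA51EFA5B36B" \
           "dcauqok66griorw7m7487itp3rtrceem" \
           "rcrd"
# prints: com.4th-a.ZeroDarkTodo/C1AB045062B442A68DC2DA51EFA5B36B/dcauqok66griorw7m7487itp3rtrceem.rcrd
```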

In addition, we're storing files in AWS S3, which is not a filesystem. It's actually a simple key/value store. From the perspective of S3, our "path" is actually just a string: it's the "key", which just happens to have '/' characters in it. And S3 limits the length of a key to 1,024 bytes.

Thus we cannot simply hash all the names in a path:

  • /foo/bar/buzz/lightyear/toystory would be
  • /32_char_hash/32_char_hash/32_char_hash/32_char_hash/32_char_hash

Which eventually becomes too long for a key in S3. So instead we use the concept of a "dirPrefix":

  • Every node has a randomly generated dirPrefix
  • These are UUIDs (32 characters, hexadecimal, 128 bits of entropy)
  • The root node is the only exception, which is hard-coded to be all zeros.
  • Thus all direct children of node X share the same /dirPrefix/

In our "toystory" node example above, the parent node ("lightyear") has a dirPrefix of C1AB045062B442A68DC2DA51EFA5B36B. And this system allows us to map /any/valid/treesystem/path into a valid cloudPath that works with S3.
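
A bit of arithmetic shows why this matters. The sketch below compares the key length you'd get by hashing every path component against the length of a dirPrefix-style cloud path. Both function names are invented for this illustration, and the only outside fact assumed is S3's key limit of 1,024 bytes.

```shell
# Length of a path where every component is replaced by a 32-character hash,
# each preceded by '/'. Argument: depth (number of path components).
naive_key_length() {
  echo $(( $1 * (32 + 1) ))
}

# Length of a dirPrefix-style cloud path: treeID/dirPrefix/hashedName.rcrd
# It depends on the treeID only, not on how deep the node sits in the treesystem.
cloud_key_length() {
  tree_id=$1
  echo $(( ${#tree_id} + 1 + 32 + 1 + 32 + 5 ))
}

naive_key_length 5                         # prints: 165
naive_key_length 32                        # prints: 1056
cloud_key_length "com.4th-a.ZeroDarkTodo"  # prints: 93
```

So a naively hashed path would cross the 1,024-byte limit at a depth of 32, while the cloud path stays at 93 characters for this treeID no matter how deep the node is.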

You can get the cloudPath for any node in your treesystem via the CloudPathManager.

Back to the audit

The dirPrefix that's all zeros is the root directory. So you can inspect all the files in your root directory like so:

$ aws --profile paste_localUserID_here s3api list-objects-v2 --bucket paste_bucket_here --prefix "com.4th-a.ZeroDarkTodo/00000000000000000000000000000000/" --delimiter /

Which will give you output that looks something like this:

{
  "Contents": [
    {
      "Key": "com.4th-a.ZeroDarkTodo/00000000000000000000000000000000/6nx4o6ykoq7rd4isy4qsze3pcwftwrpn.data",
      "LastModified": "2019-10-08T21:21:44.000Z",
      "ETag": "\"967e064ace5b8df60aa3bea62acc6625\"",
      "Size": 128,
      "StorageClass": "STANDARD"
    },
    {
      "Key": "com.4th-a.ZeroDarkTodo/00000000000000000000000000000000/6nx4o6ykoq7rd4isy4qsze3pcwftwrpn.rcrd",
      "LastModified": "2019-10-08T21:21:43.000Z",
      "ETag": "\"90a2a7dd5161d295aed6b2d94576b12f\"",
      "Size": 926,
      "StorageClass": "STANDARD"
    }
  ]
}

We can see here that there's one node in the root directory (with both a RCRD & DATA file). Feel free to download the DATA file to verify that it's encrypted. You'll notice it's just a blob of unintelligible bytes. In particular, it's encrypted with Threefish 512, a tweakable block cipher. (Encryption is performed in Cleartext2CloudFileInputStream.)
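
For a directory with more than a handful of files, counting nodes by eye gets tedious. Here's a small sketch that collapses a list of keys into one line per node by stripping the two known extensions. The node_names function is a hypothetical helper, not something shipped with the framework or the AWS CLI.

```shell
# Hypothetical helper: read S3 keys (one per line) on stdin and print each
# distinct node once, by stripping the .rcrd / .data extension.
node_names() {
  sed -e 's/\.rcrd$//' -e 's/\.data$//' | sort -u
}

# Usage with the AWS CLI (same placeholders as in the commands above):
# aws --profile paste_localUserID_here s3api list-objects-v2 \
#     --bucket paste_bucket_here \
#     --prefix "com.4th-a.ZeroDarkTodo/00000000000000000000000000000000/" \
#     --delimiter / --query 'Contents[].Key' --output text | tr '\t' '\n' | node_names
```

For the listing above, this would print a single line, since both keys belong to the same node.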

Let's download the RCRD file:

$ aws --profile paste_localUserID_here s3api get-object --bucket paste_bucket_here --key "com.4th-a.ZeroDarkTodo/00000000000000000000000000000000/6nx4o6ykoq7rd4isy4qsze3pcwftwrpn.rcrd" "rcrd.json"

Again, this downloads the given file from S3, and stores it in your current directory. In this case we saved the file locally as "rcrd.json". And if we inspect the downloaded JSON file, we'll find something like this:

{
  "metadata": "i86LB9QVv8F+tbhLMy02kGW+XowqRznj28/absAuOfaNAQFrbPHijM3LotCxaLUzvHXaJuSIJVtS3XxGyNVytUfz1umQVXiwg1XelE8gBxJtO3C/CoDIye395hVpm1kG2ubXKA==",
  "keys": {
    "UID:rry8eu1jjxhtspo3wgf1zhsumt35fshk": {
      "perms": "rws",
      "key": "ewogICAgInZlcnNpb24iOiAxLAogICAgImVuY29kaW5nIjogIkN1cnZlNDE0MTciLAogICAgImtleUlEIjogIjdrK2ljYVRtaFhNd2xJTVNFRGpTSXc9PSIsCiAgICAia2V5U3VpdGUiOiAiVGhyZWVGaXNoLTUxMiIsCiAgICAibWFjIjogIlgvRU9hNldlbUMwPSIsCiAgICAiZW5jcnlwdGVkIjogIk1JSERCZ2xnaGtnQlpRTUVBZ01FZERCeUF3SUhBQUlCTkFJelJhdnhUcThSbVFibTVFdmlpN0xBaXhJS2N3YWtRS21xZ201Sk9xb2o1VlF5ZTBSZWh4dDMxakVqbE1BaE5pbTVnQzVRQWpRbG1mTk4vVEhzLzlWRDFkSm9FcEJCM1pkSlMzSm8zbzlTWjZVbEdBYzk5UDBuM0VzOFZQWldZVWtwRHdzeDFDaU1oTTBYQkVCVFVSMFNBYlhWb1ZHUTlLcitLZnArOEJJcDNyeVVNYmdoQW15WklGb2NZMnljMlVFUzZ5NEUwS3dMSnBIaWZjVytaVmpxTXg2MFlnUW9rc3lzSkwrcSIKfQo="
    }
  },
  "children": {
    "": {
      "prefix": "8BD6E7844D424750BEE00669C03C7612"
    }
  },
  "version": 3,
  "fileID": "7A7996CD16894347B413AF6BD777F367"
}

 

If you've read the encryption articles, this should look familiar by now. But here are the details:

fileID
  • Every node has a server-assigned fileID
  • This value is immutable, and is used by the sync system
  • In particular, it helps to facilitate move & rename operations (i.e. the sync system can detect if a file has been moved or renamed, because its fileID will match a node already in the system)
keys
  • Who has permission to read this node ?
  • Permissions are explained in more detail here
  • The big key blob (ewogIC....) is: Base64(Wrap(fileKey, pubKey))
  • In other words, the content cannot be decrypted without the file encryption key
  • And the file encryption key cannot be decrypted without the matching private key
dirPrefix
  • As explained above
  • This value is immutable - server enforced
  • Server also protects against dirPrefix collisions
version
  • Just a version number for the format of the RCRD file itself
  • If we need to make changes to the JSON format, this value gets incremented
metadata
  • Contains the filename (e.g. "toystory")
  • Also contains the node's dirSalt, which will be used by all its children
  • This information is encrypted with the file encryption key
  • So the server cannot read it, because the server cannot decrypt it

Of interest to our audit is the dirPrefix specified by the node: 8BD6E7844D424750BEE00669C03C7612. This means that all children of this particular node will share the same prefix: com.4th-a.ZeroDarkTodo/8BD6E7844D424750BEE00669C03C7612. And this is how you can walk-the-tree.
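
Walking the tree by hand means repeating two steps: pull the prefix out of a downloaded RCRD file, then list that prefix. The sketch below automates one such hop. Both function names are invented for this example, and the prefix extraction is plain text matching that assumes the "prefix" layout shown above. It is not a real JSON parser.

```shell
# Hypothetical helper: print every 32-character hex "prefix" value found in a
# downloaded RCRD file. Plain text matching, not a real JSON parser.
child_prefixes() {
  grep -o '"prefix"[[:space:]]*:[[:space:]]*"[0-9A-Fa-f]\{32\}"' "$1" |
    sed 's/.*"\([0-9A-Fa-f]\{32\}\)"$/\1/'
}

# Hypothetical helper: list the cloud directory behind each child prefix.
# Arguments: profile, bucket, treeID, path to the downloaded RCRD file.
list_children() {
  for prefix in $(child_prefixes "$4"); do
    aws --profile "$1" s3api list-objects-v2 --bucket "$2" \
        --prefix "$3/$prefix/" --delimiter /
  done
}

# Usage:
# list_children paste_localUserID_here paste_bucket_here com.4th-a.ZeroDarkTodo rcrd.json
```

Each RCRD file you download from the resulting listing can be fed back into list_children to go one level deeper.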

Code Audit

When the client framework downloads a RCRD file from the cloud, it decrypts it in ZDCCryptoTools.m, in a method called parseCloudRcrdDict.

Summary

The ZeroDark architecture was designed from the ground up specifically to achieve zero-knowledge in the cloud. By design, the system gives the server the minimum amount of information necessary to do its job. In particular:

  • the server cannot read the content generated by your app
  • the server cannot read the names of nodes

In addition to this, the client-framework is open-source to ensure that all encryption routines are available for inspection & audit.

For even more detailed information: