Work with files and blobs
Learn how to secure files and unstructured data with Piiano Vault
Vault provides two mechanisms for protecting the content of files and unstructured data:
- Store the data in a collection using an attribute with the BLOB data type.
- Use the cryptographic API to encrypt the data and store the resulting ciphertext in another system.
Vault doesn't process files directly, only their contents, even when using the Vault SDK. You must read a file's content and pass it to Vault to store or encrypt. When you read or decrypt content, you must save it in the appropriate file format.
Work with blobs in objects
This option is ideal when each user has a small number of data items, such as an identity photograph, avatar, or biometric data.
Specify a blob in a collection schema
You specify a blob as part of a collection in the same way as any other collection attribute. For example, for personal data, including an ID photo, you might specify a collection using a PVSchema like this:
employees PERSONS (
full_name NAME NULL,
id_photo BLOB
);
Or in JSON format:
{
"type": "PERSONS",
"name": "employees",
"properties": [
{
"name": "full_name",
"data_type_name": "NAME",
"is_nullable": true
},
{
"name": "id_photo",
"data_type_name": "BLOB"
}
]
}
Save and retrieve blob data
You can send blobs to and retrieve them from Vault:
- Using JSON messages by encoding the file's contents in base64. With this method, you can pass multiple blobs in one call.
- Using raw file contents by specifying the request's content type or accept header as application/octet-stream. With this method, you can pass one blob per call.
The raw file contents option is more performant as it requires the transmission of fewer bytes and avoids the conversion to and from base64.
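To put a rough number on the size difference: standard base64 represents every 3 bytes of input as 4 ASCII characters, so a blob sent in JSON is about a third larger than the same blob sent raw. The following Python sketch shows that arithmetic. The helper name `base64_length` is illustrative and not part of Vault.

```python
import base64


def base64_length(raw_length: int) -> int:
    """Return the length of the padded base64 encoding of raw_length bytes."""
    # Base64 emits 4 characters for each 3-byte group; a partial final
    # group is padded, so the group count is rounded up.
    return 4 * ((raw_length + 2) // 3)


if __name__ == "__main__":
    raw = bytes(1000)  # stand-in for a small file's contents
    encoded = base64.b64encode(raw)
    print(f"raw: {len(raw)} bytes, base64: {len(encoded)} characters")
    print(f"predicted: {base64_length(len(raw))} characters")
```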
Using JSON messages
To store the employee's ID photo, you encode the image data in base64 and then write it to the collection. For example, using the REST API Add object operation, like this:
curl -s -X POST \
--url 'http://localhost:8123/api/pvlt/1.0/data/collections/employees/objects' \
-H 'Authorization: Bearer pvaultauth' \
-H 'Content-Type: application/json' \
-d '{
"full_name":"Jane Doe",
"id_photo":"JVBERi0xLjMNJeLjz9MNCjcgMCBvYmoNPDwvTGluZWFyaXplZCAxL0wgNzk0NS9PIDkvRSAzNT…VFT0YNCg=="
}'
You can then retrieve the blob as you would any other data from a collection and retrieve the image by decoding the base64 string.
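If you build the request in code rather than by hand, the encoding and decoding steps might look like the following Python sketch. It uses only the standard library. The function names are hypothetical, and the sketch assumes the BLOB property comes back as a plain base64 string, as described above. It doesn't model the full response shape.

```python
import base64
import json


def build_add_object_body(full_name: str, photo: bytes) -> str:
    """Build the JSON body for the Add object call, base64-encoding the photo."""
    return json.dumps(
        {
            "full_name": full_name,
            # JSON cannot carry raw bytes, so the blob travels as base64 text.
            "id_photo": base64.b64encode(photo).decode("ascii"),
        }
    )


def decode_blob(encoded: str) -> bytes:
    """Turn the base64 string returned for a BLOB property back into bytes."""
    return base64.b64decode(encoded)
```

For example, `build_add_object_body("Jane Doe", pathlib.Path("photo.png").read_bytes())` would produce a body suitable for the `-d` argument, and writing the result of `decode_blob` to a file restores the image.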
Using raw file contents
To store the employee's ID photo, you send the raw bytes as the request body and name the target property in the prop query parameter. For example, using the REST API Add object operation, like this:
curl -s -X POST \
--url 'http://localhost:8123/api/pvlt/1.0/data/collections/employees/objects?prop=id_photo' \
-H 'Authorization: Bearer pvaultauth' \
-H 'Content-Type: application/octet-stream' \
--data-binary @path-to-file.bin
You can then retrieve the blob in base64, as you would any other data from a collection. Alternatively, you can get your blob in raw form like this:
OBJ_ID=32077c80-3792-4a45-a957-e365bb1c9533
curl -s -X GET \
--url "http://localhost:8123/api/pvlt/1.0/data/collections/employees/objects/${OBJ_ID}?props=id_photo" \
-H "Authorization: Bearer pvaultauth" \
-H "Accept: application/octet-stream" \
-o path-to-file.bin
Or using the CLI like this:
pvault object get-blob --id ${OBJ_ID} --prop id_photo -c employees -o path-to-file.bin
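The raw upload and download calls above can also be prepared from Python. This is a minimal sketch using the standard library's `urllib`. It mirrors the URLs, headers, and placeholder token from the curl examples. The function names are hypothetical, and the sketch only builds the requests without sending them.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8123/api/pvlt/1.0/data/collections"
TOKEN = "pvaultauth"  # same placeholder token as the curl examples


def upload_request(collection: str, prop: str, content: bytes) -> urllib.request.Request:
    """Prepare a POST that adds an object whose `prop` is the raw blob."""
    query = urllib.parse.urlencode({"prop": prop})
    return urllib.request.Request(
        f"{BASE_URL}/{collection}/objects?{query}",
        data=content,
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/octet-stream",
        },
        method="POST",
    )


def download_request(collection: str, object_id: str, prop: str) -> urllib.request.Request:
    """Prepare a GET that asks for `prop` of one object as raw bytes."""
    query = urllib.parse.urlencode({"props": prop})
    return urllib.request.Request(
        f"{BASE_URL}/{collection}/objects/{object_id}?{query}",
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Accept": "application/octet-stream",
        },
        method="GET",
    )
```

Passing either request to `urllib.request.urlopen` sends it. Against a running Vault, the body of the download response would be the blob's bytes, which you then write to a file in the appropriate format.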
Limitations
By default, Vault limits the size of a blob to 5MB. The PVAULT_DB_MAX_BLOB_LENGTH environment variable controls this limit.
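A client can check a file's size before sending it, to avoid a rejected request. The Python sketch below assumes that "5MB" means 5 × 1024 × 1024 bytes and that the limit applies to the raw content length. Neither assumption is confirmed here, so pass your own value if your Vault is configured differently. The names are illustrative.

```python
# Assumption: the default "5MB" limit is read as 5 MiB of raw content.
DEFAULT_MAX_BLOB_BYTES = 5 * 1024 * 1024


def fits_blob_limit(content: bytes, max_bytes: int = DEFAULT_MAX_BLOB_BYTES) -> bool:
    """Return True if the content is no larger than the configured limit."""
    return len(content) <= max_bytes
```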
Use encryption to work with blobs
This option is best when you have a large number of files or large files, such as medical images. In this case, you store the encrypted blobs in your own storage solution (for example, a database or S3) and use Vault as an encryption and decryption service.
Specify collection schema
The APIs that encrypt and decrypt blobs are based on the APIs that encrypt and decrypt structured object data, so they require a collection to work against. A basic collection only needs to include the blob attribute, like this in PVSchema:
files DATA (
file BLOB
);
Or in JSON format:
{
"type": "DATA",
"name": "files",
"properties": [
{
"name": "file",
"data_type_name": "BLOB"
}
]
}
Encrypt your file
Now you pass the binary content of the file to the REST API encrypt blob operation like this:
curl --request POST \
--url 'http://localhost:8123/api/pvlt/1.0/data/collections/files/encrypt/blob?reason=Maintenance&prop=file&type=randomized&scope=default&tags=tag1%2Ctag2' \
--header 'Authorization: Bearer pvaultauth' \
--header 'Content-Type: application/octet-stream' \
--data-binary @path-to-file.bin
The operation returns ciphertext, which you store in a file server or a database optimized for storing large objects.
Decrypt your blob
To retrieve the content of the file, you pass the ciphertext to the REST API decrypt blob operation like this:
curl --request POST \
--url 'http://localhost:8123/api/pvlt/1.0/data/collections/files/decrypt/blob?reason=Maintenance&prop=file&scope=default' \
--header 'Authorization: Bearer pvaultauth' \
--header 'Content-Type: application/octet-stream' \
--data-binary @path-to-ciphertext.bin
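The encrypt and decrypt calls differ only in the path segment and query parameters, so both can be prepared by one helper. The Python sketch below uses the standard library's `urllib` and reuses the URLs, parameters, and placeholder token from the curl examples. The function name is hypothetical, and the sketch builds the request without sending it.

```python
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8123/api/pvlt/1.0/data/collections"
TOKEN = "pvaultauth"  # same placeholder token as the curl examples


def blob_crypto_request(
    collection: str, operation: str, content: bytes, **params: str
) -> urllib.request.Request:
    """Prepare a POST to the encrypt/blob or decrypt/blob operation."""
    if operation not in ("encrypt", "decrypt"):
        raise ValueError("operation must be 'encrypt' or 'decrypt'")
    # urlencode percent-encodes values, so "tag1,tag2" becomes "tag1%2Ctag2".
    query = urllib.parse.urlencode(params)
    return urllib.request.Request(
        f"{BASE_URL}/{collection}/{operation}/blob?{query}",
        data=content,
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/octet-stream",
        },
        method="POST",
    )
```

For example, `blob_crypto_request("files", "encrypt", data, reason="Maintenance", prop="file", type="randomized", scope="default", tags="tag1,tag2")` yields the same URL as the encrypt example. Passing the request to `urllib.request.urlopen` would send it to a running Vault.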