Data types

Discover the data types supported by Vault

Vault provides standard, personal information, financial information, and identifier semantic data types. You can also add custom data types (beta).

Standard types

| Name | Description | Validation | Normalization | Notes |
| --- | --- | --- | --- | --- |
| STRING | A string. | Must not contain more characters than the value of the PVAULT_DB_MAX_STRING_LENGTH environment variable. | Vault applies Unicode Normalization Form Composed (NFC), which, whenever possible, replaces sequences of code points with a single code point. | When the PVAULT_DB_MAX_STRING_LENGTH environment variable is set to more than 2048, properties for this data type, or for custom data types based on this data type, cannot be set to be unique or indexed in the database. |
| LONG_TEXT | A string. | Vault applies no validation. | See STRING normalization. | Properties for this data type, or for custom data types based on this data type, cannot be set to be unique or indexed in the database. |
| JSON | A JSON object. | Must comply with RFC 7159. | Vault guarantees that the normalized value is equivalent to the original value as a JSON object, but not necessarily as a string. | Properties for this data type, or for custom data types based on this data type, cannot be set to be unique or indexed in the database. |
| INTEGER | A 64-bit signed integer. | Must be greater than or equal to −2^63 and less than or equal to 2^63 − 1. | Vault applies no normalization. | |
| BOOLEAN | A boolean value. | Must be true or false (lower case characters). | Vault applies no normalization. | |
| DOUBLE | A 64-bit decimal floating-point number. | Must be a valid JSON number. | Vault may truncate values that cannot be represented accurately by this type. | |
| BLOB | A base64 string. Can be used to store files (e.g., your customers' KYC files). | Must be a valid base64 string. By default, the binary blob (before it is encoded to base64) can be up to 5 MB. The PVAULT_DB_MAX_BLOB_LENGTH environment variable controls this limit. | | Properties for this data type, or for custom data types based on this data type, cannot be set to be unique or indexed in the database. |
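
The STRING and JSON normalization rules can be illustrated with a short Python sketch. It is not Vault's implementation: the function name normalize_string is hypothetical, and the sketch only shows what NFC composition and "equivalent as a JSON object, but not necessarily as a string" mean in practice.

```python
import json
import unicodedata


def normalize_string(value: str) -> str:
    """Hypothetical helper: apply Unicode NFC, composing code point
    sequences into single code points where possible."""
    return unicodedata.normalize("NFC", value)


# 'e' followed by U+0301 COMBINING ACUTE ACCENT (two code points)
decomposed = "Cafe\u0301"
# After NFC the pair becomes the single code point U+00E9
composed = normalize_string(decomposed)

# Two JSON texts that differ as strings but are equal as JSON objects
original = '{"b": 1,   "a": [1, 2]}'
reformatted = '{"a":[1,2],"b":1}'
same_object = json.loads(original) == json.loads(reformatted)
```

Here composed has one code point fewer than decomposed, and same_object is True even though the two JSON strings differ in key order and whitespace.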

Personal information types

| Name | Description | Validation | Normalization |
| --- | --- | --- | --- |
| EMAIL | A string containing an email address. | Must comply with emailRegexString. | Applies RFC 2047 formatting and then deletes the "<" prefix and ">" suffix. |
| EMAIL_STRICT | A string containing an email address or its alias. | See EMAIL validation. | See Strict email normalization. |
| URL | A string containing an absolute URL (e.g., https://www.piiano.com/) or a relative URL (e.g., piiano.com). | Must comply with the BNF for specific URL schemes specification. Must contain no more than 2048 characters. | See URL normalization. |
| PHONE_NUMBER | A string of up to 15 digits with an optional leading '+'. Hyphens '-' may separate groups of digits in the string. For example, +1-123-4567890. | Must comply with E.164 and include the international country code. | Vault removes all hyphens and adds a + prefix. |
| ZIP_CODE_US | A string containing 5 or 9 digits. For example, 10004, 71109-1500, or 71109 1500. | Must comply with ISO 3166-2:US, matching the regular expression ^\d{5}([ \-]\d{4})?$. | Vault applies no normalization. |
| TIMESTAMP | A string containing a date and time. | Must be in one of the formats specified in Supported timestamp formats. | Normalizes to RFC 3339 with the ISO 8601 profile (using the "T" separator and terminating with "Z" to indicate the GMT zone). |
| DATE | A string containing a date. | Must comply with the ISO 8601 format "YYYY-MM-DD". | Vault applies no normalization. |
| DATE_OF_BIRTH | A string containing a date. | See DATE validation. | See DATE normalization. |
| NAME | A string containing a name. | See STRING validation. | See STRING normalization. |
| GENDER | A case-insensitive string containing a gender. | See STRING validation. | Converted to lower case, then follows STRING normalization. |
| SSN | A string containing a US Social Security Number. | Must include exactly 9 digits, either unseparated or in three groups separated by spaces or hyphens. For example, 444-21-4300, 444 21 4300, or 444214300. | Normalizes to a 9-digit string in the format ddd-dd-dddd, where d is a single digit. |
| ADDRESS | A string containing an address. | Must contain no more than 1024 characters. | See LONG_TEXT normalization. |
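
The ZIP_CODE_US, SSN, and PHONE_NUMBER rules above can be sketched in Python as follows. This is a minimal sketch, not Vault's code: the function names are hypothetical, the SSN pattern is one reading of "three groups separated by spaces or hyphens", and the phone check only counts digits (it does not verify the E.164 country code).

```python
import re

# Regular expression quoted in the ZIP_CODE_US validation rule
ZIP_CODE_US_RE = re.compile(r"^\d{5}([ \-]\d{4})?$")

# Assumed pattern: 9 plain digits, or 3-2-4 groups split by spaces or hyphens
SSN_RE = re.compile(r"^(\d{9}|\d{3}[ -]\d{2}[ -]\d{4})$")


def normalize_ssn(value: str) -> str:
    """Hypothetical helper: validate an SSN and return it as ddd-dd-dddd."""
    if not SSN_RE.match(value):
        raise ValueError(f"invalid SSN: {value!r}")
    digits = re.sub(r"[ -]", "", value)
    return f"{digits[:3]}-{digits[3:5]}-{digits[5:]}"


def normalize_phone_number(value: str) -> str:
    """Hypothetical helper: remove hyphens and ensure a leading '+'."""
    stripped = value.replace("-", "")
    # Up to 15 digits with an optional leading '+', as in the description
    if not re.fullmatch(r"\+?\d{1,15}", stripped):
        raise ValueError(f"invalid phone number: {value!r}")
    return stripped if stripped.startswith("+") else "+" + stripped
```

For example, normalize_ssn("444 21 4300") and normalize_ssn("444214300") both yield "444-21-4300", and normalize_phone_number("+1-123-4567890") yields "+11234567890".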

Financial information types

| Name | Description | Validation | Normalization |
| --- | --- | --- | --- |
| BAN | A bank account number. | Must contain 5 to 17 digits. | Vault applies no normalization. |
| CC_HOLDER_NAME | A credit card holder name. | See STRING validation. | See STRING normalization. |
| CC_NUMBER | A credit card number with optional separators (hyphens and spaces). | Must contain from 12 to 19 digits. Separators cannot appear consecutively. The number must pass the Luhn checksum validation. | Vault removes all spaces and hyphens. |
| CC_EXPIRATION_STRING | A credit card expiration month and year. | Must match the MM/YYYY or MM/YY format. | Vault applies no normalization. |
| CC_CVV | A credit card verification value. | Must contain 3 or 4 digits only. | Vault applies no normalization. |
| US_BANK_ROUTING | A US bank routing number. See Routing Number and How to Arrange a Routing Number. | A string complying with the regular expression given below this table. | Vault applies no normalization. |
| US_BANK_ACCOUNT_NUMBER | A US bank account number. | Must include at least one character from the a-z, A-Z, 0-9 character set. See also STRING validation. | Vault applies no normalization. |

US_BANK_ROUTING validation regular expression: ^(([0-9]{9})|([0-9]{4}/[0-9]{4})|(([0-9]{2})-([0-9]{4})/([0-9]{4})))$
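
As an illustration of the CC_NUMBER and US_BANK_ROUTING rules, the following Python sketch applies the documented checks. The function names are hypothetical and the code is an assumption about how the rules combine, not Vault's implementation.

```python
import re

# Regular expression quoted in the US_BANK_ROUTING validation rule
US_BANK_ROUTING_RE = re.compile(
    r"^(([0-9]{9})|([0-9]{4}/[0-9]{4})|(([0-9]{2})-([0-9]{4})/([0-9]{4})))$"
)


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    # Walk from the rightmost digit; double every second digit
    for index, char in enumerate(reversed(digits)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def normalize_cc_number(value: str) -> str:
    """Hypothetical helper: validate a card number, then strip separators."""
    # Digit groups joined by single spaces or hyphens (no consecutive separators)
    if not re.fullmatch(r"\d+([ -]\d+)*", value):
        raise ValueError("card number has invalid characters or separators")
    digits = re.sub(r"[ -]", "", value)
    if not 12 <= len(digits) <= 19:
        raise ValueError("card number must contain 12 to 19 digits")
    if not luhn_valid(digits):
        raise ValueError("card number fails the Luhn checksum")
    return digits
```

With the well-known test number, normalize_cc_number("4111 1111 1111 1111") yields "4111111111111111", while "4111--1111-1111-1111" is rejected because of the consecutive separators.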

Identifier types

| Name | Description | Validation | Normalization |
| --- | --- | --- | --- |
| OBJECT_ID | A unique ID complying with RFC 4122 Section 3. | Must comply with RFC 4122 Section 3 and may contain mixed case characters. | Vault converts all characters to lower case. |
| TENANT_ID | An ID of a tenant. | See STRING validation. | See STRING normalization. |

Reference

Indexing

Data types can be indexed for string or substring searches or both.

  • String search indexing can be set for all data types except LONG_TEXT, JSON, and BLOB, and custom data types based on those types.
  • Substring search indexing can be set for the data types STRING, LONG_TEXT, EMAIL, EMAIL_STRICT, URL, NAME, GENDER, BAN, SSN, ADDRESS, CC_HOLDER_NAME, US_BANK_ROUTING, and US_BANK_ACCOUNT_NUMBER, and for custom data types based on STRING.

Strict email normalization

Email addresses are composed of user and domain names separated by the "@" character. For the EMAIL_STRICT type, Vault supports domain-specific email address normalization, with independent normalization of the user and domain names. The following normalizations are implemented for the specified email providers:

| Provider | Provider's identification | User name normalization | Domain name normalization | Example |
| --- | --- | --- | --- | --- |
| Google | googlemail.com, gmail.com | Convert all characters to lower case, remove all "."s, and remove all characters following the left-most "+", including "+". | @gmail.com. | John.Do.E+friends@googlemail.com -> johndoe@gmail.com |
| Apple | icloud.com | Convert all characters to lower case, and remove a subaddress (or plus address), if present. | None. | John.Doe+shopping@icloud.com -> john.doe@icloud.com |
| Microsoft | hotmail.***, live.***, outlook.*** | Convert all characters to lower case, and remove a subaddress (or plus address), if present. | None. | John.Doe+shopping@outlook.it -> john.doe@outlook.it |
| Yahoo | yahoo.***, ymail.com | Convert all characters to lower case, remove all "."s, and remove all "-"s. | None. | J.D-one-time-address@yahoo.com -> jdonetimeaddress@yahoo.com |
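
The provider rules can be approximated with the Python sketch below. It rests on several assumptions that the table does not confirm: the function name is hypothetical, "hotmail.***" and similar entries are read as "any domain whose first label is hotmail", provider matching is case-insensitive, and addresses from unlisted providers are returned unchanged.

```python
def normalize_email_strict(address: str) -> str:
    """Hypothetical helper: apply provider-specific user name normalization."""
    user, separator, domain = address.rpartition("@")
    if not separator or not user or not domain:
        raise ValueError(f"not an email address: {address!r}")

    key = domain.lower()              # used only to identify the provider
    provider = key.split(".", 1)[0]   # e.g. "outlook" for "outlook.it"

    if key in ("gmail.com", "googlemail.com"):
        # Google: lower case, drop everything from the first "+", remove dots
        user = user.lower().split("+", 1)[0].replace(".", "")
        domain = "gmail.com"
    elif key == "icloud.com" or provider in ("hotmail", "live", "outlook"):
        # Apple and Microsoft: lower case and drop the subaddress
        user = user.lower().split("+", 1)[0]
    elif provider == "yahoo" or key == "ymail.com":
        # Yahoo: lower case, remove dots and hyphens
        user = user.lower().replace(".", "").replace("-", "")
    # Assumption: other providers are left untouched by this sketch
    return f"{user}@{domain}"
```

For instance, normalize_email_strict("John.Do.E+friends@googlemail.com") yields "johndoe@gmail.com", matching the Google example in the table.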

URL normalization

Vault applies all of the following normalizations to values of type URL:

| Normalization | Example |
| --- | --- |
| Converts the host name to lower case | http://HOST -> http://host |
| Converts the scheme to lower case | HTTP://host -> http://host |
| Converts escape characters to upper case | http://host/t%ef -> http://host/t%EF |
| Decodes unnecessary escapes | http://host/t%41 -> http://host/tA |
| Encodes necessary escapes | http://host/!"#$ -> http://host/%21%22#$ |
| Removes the default port | http://host:80 -> http://host |
| Removes an empty query separator | http://host/path? -> http://host/path |
| Removes a trailing slash | http://host/path/ -> http://host/path |
| Removes dot segments | http://host/path/./a/b/../c -> http://host/path/a/c |
| Removes duplicate slashes | http://host/path//a///b -> http://host/path/a/b |

Note: The URL parameters are not changed.
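
A subset of these normalizations can be sketched with Python's standard library. The sketch is illustrative only and the function name is hypothetical: it covers case folding, escape upper-casing, default ports, empty query separators, trailing slashes, dot segments, and duplicate slashes, but it does not decode or encode escapes, it drops any user information in the URL, and how Vault treats a bare "/" path is not covered by the examples, so the sketch leaves it alone.

```python
import posixpath
import re
from urllib.parse import urlsplit, urlunsplit

# Assumed default ports; the table only shows port 80 for http
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url: str) -> str:
    """Hypothetical helper covering part of the URL normalization table."""
    parts = urlsplit(url)
    scheme = parts.scheme        # urlsplit already lower-cases the scheme
    host = parts.hostname or ""  # .hostname is returned in lower case

    netloc = host
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{host}:{parts.port}"

    # Upper-case the hex digits of percent escapes, e.g. %ef -> %EF
    path = re.sub(r"%[0-9a-fA-F]{2}", lambda m: m.group(0).upper(), parts.path)
    # Collapse duplicate slashes
    path = re.sub(r"/{2,}", "/", path)
    # Resolve "." and ".." segments and drop a trailing slash
    if path not in ("", "/"):
        path = posixpath.normpath(path)

    # The query string is passed through unchanged; an empty one drops the "?"
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))
```

For example, normalize_url("http://host/path/./a/b/../c") yields "http://host/path/a/c", and a URL such as "http://host/path?b=2&a=1" comes back with its parameters untouched.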

Supported timestamp formats

A timestamp (date and time) must be provided in one of the following formats:

  • UnixDate, Mon Jan 2 15:04:05 MST 2006 or Mon Jan 02 15:04:05 MST 2006
  • ANSIC, Mon Jan 2 15:04:05 2006 or Mon Jan 02 15:04:05 2006
  • RFC850, Monday, 02-Jan-06 15:04:05 MST
  • RFC1123, Mon, 02 Jan 2006 15:04:05 MST
  • RFC1123 with numeric zone, Mon, 02 Jan 2006 15:04:05 -0700
  • RFC2822, 02 Jan 06 15:04 MST
  • RFC2822 with numeric zone, 02 Jan 06 15:04 -0700

Vault also supports the following timestamp formats, which use the "T" separator before the time and either a "Z" suffix or a numeric offset for the zone:

  • ISO-8601, 2006-01-02T15:04:05Z
  • RFC3339 with the "T" separator, 2006-01-02T15:04:05+07:00 or 2006-01-02T15:04:05Z
  • RFC3339Nano with the "T" separator and microsecond (μs) precision, 2006-01-02T15:04:05.999999-07:00 or 2006-01-02T15:04:05.999999Z

Note: A terminating "Z" indicates Greenwich Mean Time (GMT). Four optional digits instead of the "Z" indicate a time offset relative to GMT.
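
To show how several of these formats could map to the normalized TIMESTAMP form, here is a minimal Python sketch. It is an assumption-laden illustration, not Vault's parser: the function name is hypothetical, only the formats with a numeric offset or "Z" are handled (Python's strptime does not reliably parse zone names such as MST), and the sketch assumes that normalization converts the value to GMT before appending "Z".

```python
from datetime import datetime, timezone

# strptime patterns for a subset of the supported formats (assumed mapping)
FORMATS = (
    "%Y-%m-%dT%H:%M:%S%z",       # ISO-8601 / RFC3339 with "T" separator
    "%Y-%m-%dT%H:%M:%S.%f%z",    # RFC3339 with fractional seconds
    "%a, %d %b %Y %H:%M:%S %z",  # RFC1123 with numeric zone
    "%d %b %y %H:%M %z",         # RFC2822 with numeric zone
)


def normalize_timestamp(value: str) -> str:
    """Hypothetical helper: parse a timestamp and render it as RFC 3339 in GMT."""
    for fmt in FORMATS:
        try:
            parsed = datetime.strptime(value, fmt)
        except ValueError:
            continue  # try the next format
        as_gmt = parsed.astimezone(timezone.utc)
        # isoformat() renders the zone as "+00:00"; use the "Z" form instead
        return as_gmt.isoformat().replace("+00:00", "Z")
    raise ValueError(f"unsupported timestamp format: {value!r}")
```

Under these assumptions, normalize_timestamp("2006-01-02T15:04:05+07:00") yields "2006-01-02T08:04:05Z", because the +07:00 offset is subtracted to reach GMT.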