Data types
Discover the data types supported by Vault
Vault provides standard, personal information, financial information, and identifier semantic data types. You can also add custom data types.
Standard types
Name | Description | Validation | Normalization | Notes |
---|---|---|---|---|
STRING | A string. | Must not contain more characters than the value of the PVAULT_DB_MAX_STRING_LENGTH environment variable. | Vault applies Unicode Normalization Form C (NFC), which, wherever possible, replaces a sequence of code points with a single composed code point. | When the PVAULT_DB_MAX_STRING_LENGTH environment variable is set to more than 2048, properties for this data type, or for custom data types based on this data type, cannot be set to be unique or indexed in the database. |
LONG_TEXT | A string. | Must not contain more characters than the value of the PVAULT_DB_MAX_BLOB_LENGTH environment variable. | See STRING normalization. | Properties for this data type, or for custom data types based on this data type, cannot be set to be unique or indexed in the database. |
JSON | A JSON object | Must comply with JSON RFC7159. Must not contain more characters than the value of the PVAULT_DB_MAX_BLOB_LENGTH environment variable. | Vault guarantees that the normalized value is equivalent to the original value as a JSON object, but not necessarily as a string. | Properties for this data type, or for custom data types based on this data type, cannot be set to be unique or indexed in the database. |
INTEGER | A 64-bit signed integer. | Must be greater than or equal to −2^63 and less than or equal to 2^63 − 1. | Vault applies no normalization. | |
BOOLEAN | A boolean value. | Must be true or false (lower case characters). | Vault applies no normalization. | |
DOUBLE | A 64-bit decimal floating-point number. | Must be a valid JSON number. | Vault may truncate values that cannot be represented accurately by this type. | |
BLOB | A base64 string. Can be used to store files (e.g., your customer's KYC files). | Must be a valid base64 string. By default, the binary blob (before base64 encoding) can be up to 5 MB. The PVAULT_DB_MAX_BLOB_LENGTH environment variable controls this limit. | | Properties for this data type, or for custom data types based on this data type, cannot be set to be unique or indexed in the database. |
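The STRING and JSON normalization rules above can be illustrated with a minimal Python sketch. It uses only the standard library; the function names `normalize_string` and `json_equivalent` are hypothetical and are not part of Vault's API, and the sketch does not claim to reproduce Vault's internal implementation.

```python
import json
import unicodedata


def normalize_string(value: str) -> str:
    """Apply Unicode NFC, the normalization described for STRING.

    Hypothetical helper for illustration; not a Vault API.
    """
    return unicodedata.normalize("NFC", value)


def json_equivalent(original: str, normalized: str) -> bool:
    """Check that two JSON texts are equal as JSON values, not as strings."""
    return json.loads(original) == json.loads(normalized)


# "e" followed by a combining acute accent: two code points.
decomposed = "e\u0301"
# NFC composes them into the single code point U+00E9.
composed = normalize_string(decomposed)

# Two JSON texts that differ as strings but are the same JSON object.
same_object = json_equivalent('{"a": 1, "b": [1, 2]}', '{"b":[1,2],"a":1}')
```

The JSON check shows why a normalized JSON value can differ from the submitted text (key order, whitespace) while remaining the same object.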
Personal information types
Name | Description | Validation | Normalization |
---|---|---|---|
EMAIL | A string containing an email address. | Must comply with emailRegexString. | Applies RFC 2047 formatting and then deletes the "<" prefix and ">" suffix. |
EMAIL_STRICT | A string containing an email address or its alias. | See EMAIL validation | See Strict Email normalization. |
URL | A string containing an absolute URL (e.g., https://www.piiano.com/) or a relative URL (e.g., piiano.com). | Must comply with the BNF for specific URL schemes specification. Must contain no more than 2048 characters. | See URL normalization. |
PHONE_NUMBER | A string of up to 15 digits with an optional leading '+'. Hyphens '-' may separate groups of digits in the string. For example, +1-123-4567890. | Must comply with E.164 and include the international country code. | Vault removes all hyphens and adds a + prefix. |
ZIP_CODE_US | A string containing 5 or 9 digits. For example, 10004, 71109-1500, or 71109 1500. | Must comply with ISO 3166-2:US, matching the regular expression ^\d{5}([ \-]\d{4})?$. | Vault applies no normalization. |
TIMESTAMP | A string containing a date and time. | Must be in one of the formats specified in Supported timestamp formats. | Vault converts the value to RFC 3339 with the ISO 8601 profile (using the "T" separator and a terminating "Z" to indicate the GMT zone). |
DATE | A string containing a date. | Must comply with ISO-8601 format, "YYYY-MM-DD". | Vault applies no normalization. |
DATE_OF_BIRTH | A string containing a date. | See DATE validation. | See DATE normalization. |
NAME | A string containing a name. | See STRING validation. | See STRING normalization. |
GENDER | A case-insensitive string containing a gender. | See STRING validation. | Converted to lower case, then follows STRING normalization. |
SSN | A string containing a US Social Security Number. | Must include exactly 9 digits, either unseparated or in three groups separated by spaces or hyphens. For example, 444-21-4300, 444 21 4300, or 444214300. | Normalizes to a 9-digit string in the format ddd-dd-dddd, where d is a single digit. |
ADDRESS | A string containing an address. | Must contain no more than 1024 characters. | See LONG_TEXT normalization. |
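As a rough illustration of the validation and normalization rules in this table, the following Python sketch covers ZIP_CODE_US, SSN, and PHONE_NUMBER. The ZIP code pattern is the one quoted in the table. Everything else is an assumption made for the example: the SSN pattern (which requires the same separator in both positions), the function names, and the simplified phone check, which only counts digits and does not verify the E.164 country code.

```python
import re

# Regular expression quoted in the ZIP_CODE_US row.
ZIP_CODE_US = re.compile(r"^\d{5}([ \-]\d{4})?$", re.ASCII)

# Assumed pattern: 9 digits, optionally grouped 3-2-4 with one consistent
# separator (space or hyphen). The table does not give an exact regex.
SSN = re.compile(r"([0-9]{3})([ -]?)([0-9]{2})\2([0-9]{4})")


def normalize_ssn(value: str) -> str:
    """Validate an SSN and return it in ddd-dd-dddd form."""
    match = SSN.fullmatch(value)
    if match is None:
        raise ValueError(f"invalid SSN: {value!r}")
    return f"{match[1]}-{match[3]}-{match[4]}"


def normalize_phone_number(value: str) -> str:
    """Remove hyphens and ensure a leading '+', as the table describes.

    Simplified check: an optional '+' followed by 1 to 15 digits.
    """
    compact = value.replace("-", "")
    if re.fullmatch(r"\+?[0-9]{1,15}", compact) is None:
        raise ValueError(f"invalid phone number: {value!r}")
    return "+" + compact.lstrip("+")
```

For example, `normalize_ssn("444 21 4300")` and `normalize_ssn("444214300")` both produce `444-21-4300`, and `normalize_phone_number("+1-123-4567890")` produces `+11234567890`.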
Financial information types
Name | Description | Validation | Normalization |
---|---|---|---|
BAN | A bank account number. | Must contain 5 – 17 digits. | Vault applies no normalization. |
CC_HOLDER_NAME | A credit card holder name. | See STRING validation. | See STRING normalization. |
CC_NUMBER | A credit card number with optional separators (hyphens and spaces). | Must contain from 12 to 19 digits. Separators cannot appear consecutively. The number must pass the Luhn checksum validation. | Vault removes all spaces and hyphens. |
CC_EXPIRATION_STRING | A credit card expiration month and year. | Must match the MM/YYYY or MM/YY format. | Vault applies no normalization. |
CC_CVV | A credit card verification value. | Must contain 3 or 4 digits only. | Vault applies no normalization. |
US_BANK_ROUTING | A US bank routing number. See Routing Number and How to Arrange a Routing Number. | A string complying with the regular expression ^(([0-9]{9})|([0-9]{4}/[0-9]{4})|(([0-9]{2})-([0-9]{4})/([0-9]{4})))$ . | Vault applies no normalization. |
US_BANK_ACCOUNT_NUMBER | A US bank account number. | Must include at least one character from the a-z, A-Z, 0-9 character set. See also STRING validation. | Vault applies no normalization. |
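The CC_NUMBER rules (12 to 19 digits, no consecutive separators, Luhn checksum, separators removed on normalization) can be sketched in Python as follows. The function names are hypothetical, and the sketch is one plausible reading of the table rather than Vault's actual validator. The US_BANK_ROUTING pattern is copied from the table.

```python
import re

# Regular expression quoted in the US_BANK_ROUTING row.
US_BANK_ROUTING = re.compile(
    r"^(([0-9]{9})|([0-9]{4}/[0-9]{4})|(([0-9]{2})-([0-9]{4})/([0-9]{4})))$"
)


def luhn_valid(digits: str) -> bool:
    """Return True if a string of decimal digits passes the Luhn checksum."""
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for index, char in enumerate(reversed(digits)):
        digit = int(char)
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def normalize_cc_number(value: str) -> str:
    """Validate a card number and return it without spaces or hyphens."""
    if re.search(r"[ -]{2}", value):
        raise ValueError("separators cannot appear consecutively")
    digits = re.sub(r"[ -]", "", value)
    if re.fullmatch(r"[0-9]{12,19}", digits) is None:
        raise ValueError("must contain from 12 to 19 digits")
    if not luhn_valid(digits):
        raise ValueError("Luhn checksum failed")
    return digits
```

With this sketch, the widely used test number `4111 1111 1111 1111` normalizes to `4111111111111111`, while changing its last digit makes the Luhn check fail.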
Identifier types
Name | Description | Validation | Normalization |
---|---|---|---|
OBJECT_ID | A unique ID complying with RFC 4122 Section 3. | Must comply with RFC 4122 Section 3 and may contain mixed case characters. | Vault converts all characters to lower case. |
TENANT_ID | An ID of a tenant. | See STRING validation. | See STRING normalization. |
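A minimal Python sketch of the OBJECT_ID rule is shown below. It assumes the value uses the canonical hyphenated 8-4-4-4-12 hexadecimal layout from RFC 4122 Section 3; the pattern and the function name `normalize_object_id` are illustrative assumptions, not Vault's code.

```python
import re

# Assumed pattern: canonical 8-4-4-4-12 hexadecimal UUID text, any case.
OBJECT_ID = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)


def normalize_object_id(value: str) -> str:
    """Validate a UUID string and convert it to lower case."""
    if OBJECT_ID.fullmatch(value) is None:
        raise ValueError(f"invalid object ID: {value!r}")
    return value.lower()
```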
Reference
Indexing
Data types can be indexed for string or substring searches or both.
- String search indexing can be set for all data types except LONG_TEXT, JSON, or BLOB, or custom data types based on those types.
- Substring search indexing can be set for the data types STRING, LONG_TEXT, EMAIL, EMAIL_STRICT, URL, NAME, GENDER, BAN, SSN, ADDRESS, CC_HOLDER_NAME, US_BANK_ROUTING, US_BANK_ACCOUNT_NUMBER, or custom data types based on STRING.
Strict email normalization
Email addresses are composed of user and domain names separated by the "@" character. For the EMAIL_STRICT type, Vault supports domain-specific email address normalization, with independent normalization of the user and domain names. The following normalizations are implemented for the specified email providers:
Provider | Provider's identification | User name normalization | Domain name normalization | Example |
---|---|---|---|---|
Google | googlemail.com, gmail.com | Convert all characters to lower case, remove all "."s, and remove all characters following the left-most "+", including "+". | @gmail.com | John.Do.E+friends@googlemail.com -> johndoe@gmail.com |
Apple | icloud.com | Convert all characters to lower case, and remove a subaddress (or plus address), if present. | None. | John.Doe+shopping@icloud.com -> john.doe@icloud.com |
Microsoft | hotmail.*** , live.*** , outlook.*** | Convert all characters to lower case, and remove a subaddress (or plus address), if present. | None. | John.Doe+shopping@outlook.it -> john.doe@outlook.it |
Yahoo | yahoo.*** , ymail.com | Convert all characters to lower case, remove all "."s, and remove all "-"s. | None. | J.D-one-time-address@yahoo.com -> jdonetimeaddress@yahoo.com |
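The provider rules in the table can be sketched in Python as below. Several points are assumptions made for the example: the function name `normalize_email_strict`, reading the `***` wildcard as "any suffix after the dot", matching the provider domain case-insensitively, and returning addresses from other providers unchanged. The sketch also skips the basic EMAIL validation and normalization step.

```python
def _strip_subaddress(user: str) -> str:
    """Remove everything from the left-most '+' onward."""
    return user.split("+", 1)[0]


def normalize_email_strict(address: str) -> str:
    """Apply provider-specific user and domain normalization (sketch)."""
    user, separator, domain = address.rpartition("@")
    if not separator or not user or not domain:
        raise ValueError(f"not an email address: {address!r}")

    host = domain.lower()  # used only to identify the provider

    # Google: lower case, drop dots, drop subaddress, domain becomes gmail.com.
    if host in ("gmail.com", "googlemail.com"):
        return _strip_subaddress(user.lower().replace(".", "")) + "@gmail.com"

    # Apple and Microsoft: lower case and drop subaddress; domain unchanged.
    if host == "icloud.com" or host.startswith(("hotmail.", "live.", "outlook.")):
        return _strip_subaddress(user.lower()) + "@" + domain

    # Yahoo: lower case, drop dots and hyphens; domain unchanged.
    if host == "ymail.com" or host.startswith("yahoo."):
        return user.lower().replace(".", "").replace("-", "") + "@" + domain

    # Assumption: no provider-specific rule applies to other domains.
    return address
```

Calling `normalize_email_strict("John.Do.E+friends@googlemail.com")` yields `johndoe@gmail.com`, matching the Google example in the table.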
URL normalization
Vault applies all of the following normalizations to values of type URL:
Normalization | Example |
---|---|
Converts the host name to lower case | http://HOST -> http://host |
Converts the scheme to lower case | HTTP://host -> http://host |
Converts escape characters to upper case | http://host/t%ef -> http://host/t%EF |
Decodes unnecessary escapes | http://host/t%41 -> http://host/tA |
Encodes necessary escapes | http://host/!"#$ -> http://host/%21%22#$ |
Removes the default port | http://host:80 -> http://host |
Removes an empty query separator | http://host/path? -> http://host/path |
Removes a trailing slash | http://host/path/ -> http://host/path |
Removes dot segments | http://host/path/./a/b/../c -> http://host/path/a/c |
Removes duplicate slashes | http://host/path//a///b -> http://host/path/a/b |
Vault does not change the parameters of the URL.
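Most of the URL normalizations listed above can be approximated with Python's standard library, as in the sketch below. It is an illustration under stated assumptions, not Vault's implementation: the function name `normalize_url` is hypothetical, only the http and https default ports are known to it, user information and IPv6 hosts are not handled, and the "encodes necessary escapes" step is omitted.

```python
import posixpath
import re
from urllib.parse import urlsplit, urlunsplit

DEFAULT_PORTS = {"http": 80, "https": 443}  # assumed set of default ports
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)


def _fix_escape(match: re.Match) -> str:
    """Decode an escape of an unreserved character, else upper-case it."""
    char = chr(int(match.group(1), 16))
    return char if char in UNRESERVED else match.group(0).upper()


def normalize_url(url: str) -> str:
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()

    # Keep the port only when it is not the scheme's default.
    netloc = host
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        netloc = f"{host}:{parts.port}"

    path = re.sub(r"%([0-9A-Fa-f]{2})", _fix_escape, parts.path)
    path = re.sub(r"/{2,}", "/", path)  # remove duplicate slashes
    if path:
        # Removes dot segments and a trailing slash.
        path = posixpath.normpath(path)

    # The query string is passed through untouched; an empty one is dropped.
    return urlunsplit((scheme, netloc, path, parts.query, parts.fragment))
```

For instance, `normalize_url("HTTP://HOST:80/path/./a/b/../c")` returns `http://host/path/a/c`.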
Supported timestamp formats
A timestamp (date and time) must be provided in one of the following formats:
- UnixDate, Mon Jan 2 15:04:05 MST 2006 or Mon Jan 02 15:04:05 MST 2006
- ANSIC, Mon Jan 2 15:04:05 2006 or Mon Jan 02 15:04:05 2006
- RFC850, Monday, 02-Jan-06 15:04:05 MST
- RFC1123, Mon, 02 Jan 2006 15:04:05 MST
- RFC1123 with numeric zone, Mon, 02 Jan 2006 15:04:05 -0700
- RFC2822, 02 Jan 06 15:04 MST
- RFC2822 with numeric zone, 02 Jan 06 15:04 -0700
Vault also supports the following timestamp formats, which use the "T" separator before the time and a "Z" suffix for the zone:
- ISO-8601, 2006-01-02T15:04:05Z
- RFC3339 with the "T" separator, 2006-01-02T15:04:05+07:00 or 2006-01-02T15:04:05Z
- RFC3339Nano with the "T" separator, with microseconds (μs) precision, 2006-01-02T15:04:05.999999-07:00 or 2006-01-02T15:04:05.999999Z
A terminating "Z" indicates Greenwich Mean Time (GMT). Four optional digits instead of the "Z" indicate a time offset relative to GMT.
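To illustrate how a timestamp in one of these formats could be converted to the normalized form (RFC 3339 with a "T" separator and a terminating "Z"), here is a minimal Python sketch. It handles only the ISO-8601/RFC3339 forms and the two numeric-zone layouts, because Python's `strptime` does not reliably parse zone abbreviations such as MST. The function name `to_rfc3339_utc` is hypothetical, and converting the offset to GMT while keeping fractional seconds is an assumption about the normalized output.

```python
from datetime import datetime, timezone

# strptime layouts for the numeric-zone formats listed above.
# %a and %b assume an English (C) locale.
NUMERIC_ZONE_LAYOUTS = [
    "%a, %d %b %Y %H:%M:%S %z",  # RFC1123 with numeric zone
    "%d %b %y %H:%M %z",         # RFC2822 with numeric zone
]


def to_rfc3339_utc(value: str) -> str:
    """Parse a supported timestamp and render it as RFC 3339 ending in 'Z'."""
    parsed = None
    try:
        # Older Python versions do not accept a trailing "Z" here.
        iso = value[:-1] + "+00:00" if value.endswith("Z") else value
        parsed = datetime.fromisoformat(iso)
    except ValueError:
        for layout in NUMERIC_ZONE_LAYOUTS:
            try:
                parsed = datetime.strptime(value, layout)
                break
            except ValueError:
                continue
    if parsed is None or parsed.tzinfo is None:
        raise ValueError(f"unsupported timestamp: {value!r}")

    text = parsed.astimezone(timezone.utc).isoformat()
    return text[: -len("+00:00")] + "Z"
```

For example, `to_rfc3339_utc("2006-01-02T15:04:05+07:00")` returns `2006-01-02T08:04:05Z`.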