Substring search on encrypted properties
Learn how to search for objects in Vault using glob pattern matches on encrypted properties
To search for objects in a collection using substring search, use the CLI search objects command or the REST API search objects operation, passing the collection name and the property or properties you want to retrieve. The query operator for substring search is like.
You can perform a substring search on encrypted properties only.
In the CLI, you can search for objects with a substring search like this:
pvault object query -c users --like name="*john*" --props name
or with the API, like this:
curl -X POST \
-H 'Authorization: Bearer pvaultauth' \
-H 'Content-Type: application/json' \
-d '{"like":{"name": "*john*"}}' \
"http://localhost:8123/api/pvlt/1.0/data/collections/users/query/objects?props=name"
Vault refreshes the substring index every few seconds (see the PVAULT_SERVICE_SUBSTRING_INDEX_REFRESH_INTERVAL environment variable). This means that if you add or update an object, a substring search may not return it immediately.
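Because of this delay, a script that writes an object and then searches for it may need to retry. The following is a minimal Python sketch of such a retry loop. The search_until_found helper and the fake_search stand-in are hypothetical names used for illustration; in practice, the callable would run your own substring query.

```python
import time


def search_until_found(search, timeout=10.0, interval=1.0):
    """Call `search()` until it returns a non-empty list or `timeout` elapses.

    `search` is any zero-argument callable that runs the substring query and
    returns the list of results (hypothetical; supply your own).
    """
    deadline = time.monotonic() + timeout
    while True:
        results = search()
        if results:
            return results
        if time.monotonic() >= deadline:
            return []
        time.sleep(interval)


# Stand-in for a real query: returns nothing on the first two calls,
# imitating an index that has not been refreshed yet.
calls = {"count": 0}


def fake_search():
    calls["count"] += 1
    return [{"name": "John Doe"}] if calls["count"] >= 3 else []


found = search_until_found(fake_search, timeout=5.0, interval=0.01)
print(found)
```

Choose a timeout that comfortably exceeds the configured refresh interval.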
Search patterns
To search for objects using substring search, you specify a search pattern. The supported patterns are a subset of glob patterns. The supported wildcard characters are:
- *: matches zero or more characters.
- ?: matches exactly one character.
For example:
- *john*: matches any value that contains the substring "john".
- john*: matches any value that starts with "john".
- *john: matches any value that ends with "john".
- j?hn: matches any value that has "j" as the first character, "h" as the third character, and "n" as the fourth character.
- john: reverts to exact match semantics; doesn't perform a substring match.
Searches are case-insensitive, so *john* matches any value that contains the strings "John", "john", "JOHN", etc.
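To experiment with these semantics locally, the wildcards can be translated into a regular expression. This is a minimal Python sketch of the documented behavior, not Vault's implementation; glob_to_regex and glob_match are hypothetical helper names. It does not model backslash escaping or the exact-match fallback for patterns without wildcards.

```python
import re


def glob_to_regex(pattern):
    """Translate the * and ? wildcards into a case-insensitive regex."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")           # zero or more characters
        elif ch == "?":
            parts.append(".")            # exactly one character
        else:
            parts.append(re.escape(ch))  # everything else is literal
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def glob_match(pattern, value):
    """Return True if the whole value matches the pattern."""
    return glob_to_regex(pattern).fullmatch(value) is not None


values = ("John Doe", "Elton JOHN", "jahn")
for pattern in ("*john*", "john*", "*john", "j?hn"):
    print(pattern, [v for v in values if glob_match(pattern, v)])
```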
A search pattern must contain at least two trigrams (three-character sequences). For example, *jo* is too short to be a search pattern, but *john* (which contains the trigrams joh and ohn) and *jon*snow* are valid.
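A pattern can be checked against this rule before it is sent. The Python sketch below is illustrative only: it assumes that the wildcards split the pattern into literal runs and that trigrams are counted within each run, which is consistent with the examples above but is not confirmed as Vault's exact rule. The helper names are hypothetical, and backslash escapes are ignored.

```python
import re


def pattern_trigrams(pattern):
    """List the trigrams found in the literal parts of a search pattern.

    Assumption: * and ? split the pattern into literal runs, and trigrams
    are taken within each run after lowercasing.
    """
    trigrams = []
    for run in re.split(r"[*?]", pattern.lower()):
        trigrams.extend(run[i:i + 3] for i in range(len(run) - 2))
    return trigrams


def has_enough_trigrams(pattern, minimum=2):
    """Return True if the pattern contains at least `minimum` trigrams."""
    return len(pattern_trigrams(pattern)) >= minimum


for pattern in ("*jo*", "*john*", "*jon*snow*"):
    print(pattern, pattern_trigrams(pattern), has_enough_trigrams(pattern))
```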
The backslash (\) is used as the escape character in search patterns. To search for the literal characters * and ?, prefix them with a backslash, like this: \* and \?. You can also escape the backslash itself by using two backslashes (\\).
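When a pattern is built from user input, escaping can be applied programmatically. The following is a small Python sketch of the escaping rule described above; escape_literal and contains_pattern are hypothetical helper names, not part of Vault.

```python
def escape_literal(text):
    """Prefix each backslash, * and ? with a backslash so it is matched literally."""
    return "".join("\\" + ch if ch in "\\*?" else ch for ch in text)


def contains_pattern(text):
    """Build a pattern that matches values containing `text` literally."""
    return "*" + escape_literal(text) + "*"


print(contains_pattern("what?"))
print(contains_pattern("a*b"))
```

Note that shells and JSON have their own escaping rules for the backslash, so the pattern may need a further level of escaping when placed on a command line or in a request body.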
Step-by-step
Say you have a collection called ‘customers’ that you created like this:
pvault collection add --collection-pvschema "customers PERSONS (
name NAME SUBSTRING_INDEX,
email EMAIL,
city STRING,
)"
Now, you want to retrieve email for all the customers in this collection where:
- Their first name is "John". You do this using the "like" requirement in the search request.
- Their first name is "John", and they live in "New York". You do this using the "like" requirement with a "match" requirement in the search request.
Search objects using the "like" requirement
To get the email of a customer whose first name is "John", you first need to determine the search pattern to use. In this case, the search pattern is John*, so the search query is:
name="John*"
You can search for objects with a "like" requirement using the CLI like this:
pvault object query -c customers --like name="John*" --props name,email,city
You get a response similar to:
Displaying 2 results.
+------+------------------------------+------------+
| city | email                        | name       |
+------+------------------------------+------------+
| NY   | johndoe@somemail.com         | John Doe   |
| SF   | johnlemon@yetanothermain.com | John Lemon |
+------+------------------------------+------------+
or with the API, like this:
curl -s -X POST \
-H 'Authorization: Bearer pvaultauth' \
-H 'Content-Type: application/json' \
-d '{"like":{"name": "John*"}}' \
"http://localhost:8123/api/pvlt/1.0/data/collections/customers/query/objects?props=name,email,city"
You get a response similar to:
{
  "results": [
    {
      "email": "johndoe@somemail.com",
      "name": "John Doe",
      "city": "NY"
    },
    {
      "email": "johnlemon@yetanothermain.com",
      "name": "John Lemon",
      "city": "SF"
    }
  ],
  "paging": {
    "cursor": "",
    "size": 2,
    "remaining_count": 0
  }
}
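A script consuming this response typically needs only the requested properties. The Python sketch below parses the sample response shown above and collects the email values. Treating a non-zero remaining_count as "more results are available" is an assumption based on the field name.

```python
import json

# Sample response body copied from the example above.
response_body = """
{
  "results": [
    {"email": "johndoe@somemail.com", "name": "John Doe", "city": "NY"},
    {"email": "johnlemon@yetanothermain.com", "name": "John Lemon", "city": "SF"}
  ],
  "paging": {"cursor": "", "size": 2, "remaining_count": 0}
}
"""

data = json.loads(response_body)

# Collect the email of every returned object.
emails = [obj["email"] for obj in data["results"]]

# Assumption: a non-zero remaining_count means further pages exist.
has_more = data["paging"]["remaining_count"] > 0

print(emails)
print("more results:", has_more)
```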
Search objects with the "like" and "match" or "in" requirements
You can combine the "like" requirement with "match" or "in" requirements. For example, to get the email of customers whose first name is "John" and who live in New York, use the CLI like this:
pvault object query -c customers --like name="John*" --match city="NY" --props name,email,city
or with the API, like this:
curl -s -X POST \
-H 'Authorization: Bearer pvaultauth' \
-H 'Content-Type: application/json' \
-d '{"like":{"name": "John*"},"match":{"city":"NY"}}' \
"http://localhost:8123/api/pvlt/1.0/data/collections/customers/query/objects?props=name,email,city"
You get a response similar to:
{
  "results": [
    {
      "city": "NY",
      "email": "johndoe@somemail.com",
      "name": "John Doe"
    }
  ],
  "paging": {
    "cursor": "",
    "size": 1,
    "remaining_count": 0
  }
}
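The way the two requirements combine can be illustrated with a local emulation: an object is returned only if it satisfies every "like" pattern and every "match" value. This Python sketch runs against in-memory copies of the sample customers and is not how Vault evaluates queries; the like and query helpers are hypothetical, and backslash escapes are not handled.

```python
import re

# In-memory copies of the sample customers from the responses above.
customers = [
    {"name": "John Doe", "email": "johndoe@somemail.com", "city": "NY"},
    {"name": "John Lemon", "email": "johnlemon@yetanothermain.com", "city": "SF"},
]


def like(pattern, value):
    """Case-insensitive match supporting the * and ? wildcards."""
    regex = "".join(
        ".*" if c == "*" else "." if c == "?" else re.escape(c) for c in pattern
    )
    return re.fullmatch(regex, value, re.IGNORECASE | re.DOTALL) is not None


def query(objects, like_filters, match_filters):
    """Keep objects that satisfy every "like" pattern and every "match" value."""
    return [
        obj
        for obj in objects
        if all(like(p, obj[k]) for k, p in like_filters.items())
        and all(obj[k] == v for k, v in match_filters.items())
    ]


result = query(customers, {"name": "John*"}, {"city": "NY"})
print([obj["email"] for obj in result])
```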
The response is paginated. See CLI pagination for more information about working with paginated responses.
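The API requests in this guide can also be assembled from a script. The following is a minimal Python sketch that assumes the local endpoint and the pvaultauth token used in the curl examples; build_query_request is a hypothetical helper name, not part of any Vault SDK. It only constructs the request, so you can inspect it before sending.

```python
import json
import urllib.parse
import urllib.request

# Assumptions: base URL and bearer token are copied from the curl examples.
BASE_URL = "http://localhost:8123"
TOKEN = "pvaultauth"


def build_query_request(collection, props, like, match=None):
    """Build (but do not send) a search request with "like" and optional "match"."""
    body = {"like": like}
    if match:
        body["match"] = match
    # Keep commas unescaped so the query string reads props=name,email,city.
    query = urllib.parse.urlencode({"props": ",".join(props)}, safe=",")
    url = f"{BASE_URL}/api/pvlt/1.0/data/collections/{collection}/query/objects?{query}"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {TOKEN}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_query_request(
    "customers", ["name", "email", "city"], {"name": "John*"}, {"city": "NY"}
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

To send the request to a running Vault, pass it to urllib.request.urlopen and decode the JSON response.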