SAS REST APIs: Filtering

Filtering is the application of a Boolean condition against a collection of resources in order to subset the collection to ony those resources for which the condition is true. (For those familiar with SQL, filtering is analogous to a SQL query with a WHERE clause.) SAS REST APIs support two forms of filtering: basic filtering and a filter query parameter. The two forms may be combined.

Consider, for example, a collection of report resources. Given the following members:

Member Type Description
id string The report's unique identifier.
name string The localizable report name.
description string The localizable report description.
createdBy string The name of the user who created this report.
creationTimeStamp date-time string The date and time the report wast first created.
modifiedBy string he name of the last user who modified this report.
modifiedTimeStamp date-time string The date and time the report was last modified.
imageUris object The map of images of the report object.
and this JSON representation of a report resource:
{
  "id": "4eb3b675-e107-4857-a8f4-51aa555ac7e7",
  "name": "Production Report",
  "description": "Production forecasts by region and market segment.",
  "createdBy": "dale",
  "creationTimeStamp": "2017-04-19T14:54:04.705Z",
  "modifiedBy": "dale",
  "modifiedTimeStamp": "2017-04-19T14:55:11.643Z",
  "links": [ ... ],
  "imageUris": {
    "icon": "/reports/icons/report.gif"
  },
  "version": 1
}
The filtering explanations below use this example scenario.

Basic filtering

Basic filtering allows selecting resources by matching one or more members of the resource to values passed as query parameters. For example, to find all reports created by the user dale, a basic filter is:

   GET /reports/reports?createdBy=dale

The names of members in the resource representation (such as description, name, createdBy, modifiedBy etc) are used as the name of the corresponding query parameters. The general form is

   GET /reports/reports?memberName0=value0&...&memberNamen=valuen

Basic filtering only supports exact matches of members or simple set containment. Note that quotes are not used around either the names or the string values when using basic filtering.

Some APIs support nested resources, so you can use dot notation to select a nested member, such as imageUris.icon to filter based on the icon member of the imageUris object in the above example resource.

When multiple basic query parameters are used, there is an implicit and operation. That is, all the basic filter values must match in order for a resource to be included in the response. For example, a request

   GET /reports/reports?createdBy=dale&name=Production%20Report
will match reports that were created by dale and that have the name Production Report. Note that the values must be URL encoded using standard query parameter encoding rules; for example, a space in the value string has an encoded value of %20.

Some members are not well-suited to exact matching, such as the modifiedTimeStamp which is a date-time value with sub-second precision: ?modifiedTimeStamp=2017-04-19T14:55:11.643Z. For such requirements, you should use the filter parameter described below to select resources based on ranges, comparisons, and other selection criteria.

Basic set containment filtering

When using basic filtering, you may pass a pipe-separated set of values. The basic filter matches if the named member matches any of the values in the set. As with simple query parameters, do not use quote characters around the values.

For example, to find reports created by users dale or elaine or jules, use the pipe-separated set containment notation to express the set of values, such as dale|elaine|jules. Of course, the | character must be URL-encoded as %7C when using set containment notation in a query parameter. That means:

    GET /reports/reports?createdBy=dale|elaine|jules
is URL encoded as:
    GET /reports/reports?createdBy=dale%7Celaine%7Cjules
The values must be URL encoded as well.

Filtering with the filter query parameter

The filter query parameter provides a flexible way to subset the resources from collections by combining comparison and other functions.

   ?filter=filterExpression
Unlike basic filtering, the filterExpression enables you to:

The simple (and somewhat useless) filter ?filter=true evaluates to true for all items in the collection — this is the default behavior if no filter is provided.

tree representation of the filter expression <code>and(or(e,not(b)),gt(C,0))</code>

Figure 1: Tree representation of filter expression and(or(e,not(b)),gt(C,0))

The filter query parameter names a filter expression which uses a very simple expression notation. The filter expression syntax expresses the same condition as a combination of function calls:

 and(or(e,not(b)),gt(C,0))
Parentheses surround the comma-separated arguments to each function and are required to differentiate a function from a member name. Figure 1 shows the structure of the expression; ovals represent functions or operations and rectangle represent members or literal values.

Filter expression syntax

The syntax for filter expressions:

expression := literal
expression := name
expression := function
function := identifier '(' [ arguments ] ')'
arguments := expression [ ',' arguments ]
literal := 'true' | 'false'
literal := number
literal := date
literal := time
literal := date-time
literal := string
string := "'" characters "'"
string := '"' characters '"'
name := identifier [ '.' name ]

For example, the basic query used above,

   GET /reports/reports?createdBy=dale&name=Production%20Report
can be expressed by combining two equality test functions on the members createdBy and name, combined with the and function:
   GET /reports/reports?filter=and(eq(createdBy,'dale'),eq(name,'Production%20Report'))

Filter expressions

There are three types of expressions that may be used in the filterExpression a filter query:

  1. literals
  2. names
  3. functions

Filter literals

Literals (also known as constants) are fixed values, such as

Literals are typically used as arguments to functions.

String literals are sequences of zero or more characters enclosed in quotes. In filter query parameters, strings may be enclosed in either single or double quotes. The string literal 'dale' is the same as the string literal "dale"; only the contents of the string is significant. If the string contains quote characters, use one of two representations:

  1. To include single quotes (') in a string literal, use double quotes to quote the string: "Dale Smith's ID is 'dale'".
    To include double quotes (") in a string literal, use single quotes to quote the string: 'Dale chose the ID "dale".'.
  2. Alternatively, you can uniformly double each embedded quote character:
       'IT assigned the user ID ''dale'' to Dale Smith.'
       "IT assigned the user ID 'dale' to Dale Smith."
    
    are identical and have the value
       IT assigned the user ID 'dale' to Dale Smith.
    

Names

Names in filter expressions refer to members of the items in the collection, such as the members name, createdBy, modifiedBy, or imageUris.icon from the example report resource above. Names are typically used as arguments to functions, such as comparing the members to literal values, as described below. An example is the expression

   eq(createdBy,'dale')
which is true for any item in the collection that has the exact value 'dale' as its createdBy member.

Date, time, and date-time literals are other name filters.

  1. A date literal uses the form yyyy-MM-dd such as 2017-07-27.
  2. A time literal uses the form HH:mm:ss and HH:mm:ss.SSSZ where the hours HH range from 00 to 24 and the fractional seconds and time zone are optional. The time zone code Z means UTC (GMT) time zone. Time zone offsets if +HH:mm and -HH:mm are also allowed instead of Z.
  3. A date-time literal uses the form yyyy-MM-ddTHH:mm:ss.SSSZ where the fractional seconds and time zone are optional; for example, 2017-06-28T03:18:53.0717Z
Note: Date, time, and date-time values are not quoted as strings.

Filter function expressions

Functions operate on one or more values which form the arguments to the function. A function expression has the form

function := identifier '(' [ arguments ] ')'
arguments := expression [ ',' arguments ]
such as eq(name,'dale'). Here, the function name is eq and the two function arguments (a name and a string literal) are encoded in parentheses and separated by commas.

Below is the base set of functions that may be used in filter expressions.

Logical functions

Logical functions combine other boolean expressions.

Relational functions

Relational functions compare two or more expressions. Most functions accept two or more arguments, in which case the relation must hold for consecutive values; i.e. lt(a,b,c,d) is equivalent to and(lt(a,b),lt(b,c),lt(c,d)) (that is, a < b && b < c && c < d.

This syntax cannot express combinations of relational operators, such as minInclusive ≤ x < maxExclusive; one must use and(le(minInclusive,x),lt(x,maxExclusive)).

Relational functions may be combined with date, time, and date-time literals. To find all reports created between January and March 2017 (using GMT)

   GET /reports/reports?filter=le(2017-01-01T00:00:00Z,creationTimeStamp,2017-03-31T24:00:00Z)
String functions

Functions which manipulate string values

Locale-sensitive collation

Several of the above functions require locale-sensitive collation, where the locale is derived by the Accept-Language header passed by the client.

Locale-sensitive collation strength is passed to these functions using one of several reserved identifiers, as optional first arguments. The identifier name is obtained by prefixing the collation options below with a dollar sign ('$'):

If the first parameter is not one of these special identifiers, the default is $identical. These reserved identifiers are ignored when comparing numeric and boolean values.

Examples of locale-sensitive functions, using $tertiary as an example:

Combining basic filtering and the filter query parameter

An API request may combine basic filtering and the filter query parameter. When both are used, the request combines them by adding an implicit and function to the existing filter query parameter. For example, for

    GET .../resource?name=dale&format=ruled&filter=or(bounded,lt(shipDate,2017-07-27))
The basic filters ( name=dale and format=ruled ) will be merged into the filter expression, or(bounded,lt(shipDate,2017-07-27)), yielding the effective filter query
    GET .../resource?filter=and(eq(name,'dale'),eq(format,'ruled'),or(bounded,lt(shipDate,2017-07-27)))

Note that the resulting filter expression is always false if the basic filter and the filter specify mutually exclusive values:

    GET .../resource?name=dale&format=ruled&filter=eq(name,'jack')
The effective filter query:
    GET .../resource?filter=and(eq(name,'dale'),eq(format,'ruled'),eq(name,'robert')
This is always false (resulting in an empty collection result) because name cannot be 'dale' and 'robert' at the same time.