cookbook • porcelain

In contrast with the descriptive approach in the main vignette (vignette("porcelain")), this vignette contains little recipes for exposing and testing different endpoint types. They are ordered roughly from simplest to most complicated, and are written as standalone examples (which makes them quite repetitive!)

Below, we do not use the package convention of wrapping each endpoint in a function. This is to make the examples a little shorter and to make the endpoints more directly callable. In a package, a wrapper function is needed to make the schema path point to the correct place, and to allow binding of state into the endpoint (see later examples).

The one piece of shared code is that we will use a common schema root

schema_root <- system.file("examples/schema", package = "porcelain")

GET endpoint, inputs as query parameters, returning JSON

This is the example from the main vignette, adding two numbers given as query parameters and returning a single number. Note that we need to use jsonlite::unbox() to indicate that the single number should be returned as a number and not a vector of length 1 (compare jsonlite::toJSON(1) and jsonlite::toJSON(jsonlite::unbox(1)))

add <- function(a, b) {
  jsonlite::unbox(a + b)
}

endpoint_add <- porcelain::porcelain_endpoint$new(
  "GET", "/", add,
  porcelain::porcelain_input_query(a = "numeric", b = "numeric"),
  returning = porcelain::porcelain_returning_json("numeric", schema_root))

api <- porcelain::porcelain$new(validate = TRUE)$handle(endpoint_add)

Run the endpoint:

api$request("GET", "/", query = list(a = 1, b = 2))
#> $status
#> [1] 200
#> 
#> $headers
#> $headers$`x-request-id`
#> [1] "4af2581c-1ebd-4613-830b-9cca1f5bd368"
#> 
#> $headers$`Content-Type`
#> [1] "application/json"
#> 
#> $headers$`X-Porcelain-Validated`
#> [1] "true"
#> 
#> 
#> $body
#> [1] "{\"status\":\"success\",\"errors\":null,\"data\":3}"

GET endpoint, inputs as path and query parameters, returning JSON

Slightly more interesting return type, this time returning a numeric vector.

random <- function(distribution, n) {
  switch(distribution,
         normal = rnorm(n),
         uniform = runif(n),
         exponential = rexp(n))
}

endpoint_random <- porcelain::porcelain_endpoint$new(
  "GET", "/random/<distribution>", random,
  porcelain::porcelain_input_query(n = "numeric"),
  returning = porcelain::porcelain_returning_json("numericVector", schema_root))

api <- porcelain::porcelain$new(validate = TRUE)$handle(endpoint_random)

Run the endpoint:

api$request("GET", "/random/normal", query = list(n = 4))
#> $status
#> [1] 200
#> 
#> $headers
#> $headers$`x-request-id`
#> [1] "42f31bee-abc4-4df9-b942-af842ad33c80"
#> 
#> $headers$`Content-Type`
#> [1] "application/json"
#> 
#> $headers$`X-Porcelain-Validated`
#> [1] "true"
#> 
#> 
#> $body
#> [1] "{\"status\":\"success\",\"errors\":null,\"data\":[-1.4,0.2553,-2.4373,-0.0056]}"
api$request("GET", "/random/uniform", query = list(n = 4))
#> $status
#> [1] 200
#> 
#> $headers
#> $headers$`x-request-id`
#> [1] "3c054c04-1069-4417-99ac-80ad23d2ae6b"
#> 
#> $headers$`Content-Type`
#> [1] "application/json"
#> 
#> $headers$`X-Porcelain-Validated`
#> [1] "true"
#> 
#> 
#> $body
#> [1] "{\"status\":\"success\",\"errors\":null,\"data\":[0.7329,0.7725,0.8746,0.1749]}"

Note that the output here is always a vector, even in the corner cases of 1 and 0 elements returned:

api$request("GET", "/random/normal", query = list(n = 1))
#> $status
#> [1] 200
#> 
#> $headers
#> $headers$`x-request-id`
#> [1] "1710d4c7-8320-436b-9891-56888a82a970"
#> 
#> $headers$`Content-Type`
#> [1] "application/json"
#> 
#> $headers$`X-Porcelain-Validated`
#> [1] "true"
#> 
#> 
#> $body
#> [1] "{\"status\":\"success\",\"errors\":null,\"data\":[-1.8218]}"
api$request("GET", "/random/normal", query = list(n = 0))
#> $status
#> [1] 200
#> 
#> $headers
#> $headers$`x-request-id`
#> [1] "015b6de0-a22a-4480-b58b-07e93b6abd27"
#> 
#> $headers$`Content-Type`
#> [1] "application/json"
#> 
#> $headers$`X-Porcelain-Validated`
#> [1] "true"
#> 
#> 
#> $body
#> [1] "{\"status\":\"success\",\"errors\":null,\"data\":[]}"

POST endpoint, inputs as JSON, returning JSON

Here is one way that a complex statistical procedure (here, just lm) might be wrapped as an API endpoint. We’ll run a linear regression against vectors of data x and y and return a table of coefficients.

x <- runif(10)
data <- data.frame(x = x, y = x * 2 + rnorm(length(x), sd = 0.3))
fit <- lm(y ~ x, data)
summary(fit)
#> 
#> Call:
#> lm(formula = y ~ x, data = data)
#> 
#> Residuals:
#>      Min       1Q   Median       3Q      Max 
#> -0.41326 -0.27802 -0.01725  0.20852  0.65994 
#> 
#> Coefficients:
#>             Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)  -0.1654     0.2163  -0.765 0.466436    
#> x             2.3107     0.4248   5.439 0.000617 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 0.3603 on 8 degrees of freedom
#> Multiple R-squared:  0.7871, Adjusted R-squared:  0.7605 
#> F-statistic: 29.58 on 1 and 8 DF,  p-value: 0.0006167

We’re interested in getting the table of coefficients, which we can extract like this:

summary(fit)$coefficients
#>               Estimate Std. Error    t value     Pr(>|t|)
#> (Intercept) -0.1654202  0.2163355 -0.7646466 0.4664360449
#> x            2.3106636  0.4248277  5.4390610 0.0006166622

and transform a little to turn the row names into a column of their own

lm_coef <- as.data.frame(summary(fit)$coefficients)
lm_coef <- cbind(name = rownames(lm_coef), lm_coef)
rownames(lm_coef) <- NULL

(the broom package provides a nice way of doing this sort of manipulation of these slightly opaque objects). There are many ways of serialising this sort of data; we will do it in the default way supported by jsonlite, representing the object as an array of objects, each of which is key/value pairs for each row:

jsonlite::toJSON(lm_coef, pretty = TRUE)
#> [
#>   {
#>     "name": "(Intercept)",
#>     "Estimate": -0.1654,
#>     "Std. Error": 0.2163,
#>     "t value": -0.7646,
#>     "Pr(>|t|)": 0.4664
#>   },
#>   {
#>     "name": "x",
#>     "Estimate": 2.3107,
#>     "Std. Error": 0.4248,
#>     "t value": 5.4391,
#>     "Pr(>|t|)": 0.0006
#>   }
#> ]

So we have our target function now:

fit_lm <- function(data) {
  data <- jsonlite::fromJSON(data)
  fit <- lm(y ~ x, data)
  lm_coef <- as.data.frame(summary(fit)$coefficients)
  lm_coef <- cbind(name = rownames(lm_coef), lm_coef)
  rownames(lm_coef) <- NULL
  lm_coef
}

Note that the target function must deserialise the json itself. This is so that arguments can be passed to jsonlite::fromJSON easily to control how deserialisation happens. We may support automatic deserialisation later as an argument to porcelain::porcelain_input_body_json.

The endpoint is not that much more involved than before though we have interesting inputs and outputs, with schemas required for both

endpoint_lm <- porcelain::porcelain_endpoint$new(
  "POST", "/lm", fit_lm,
  porcelain::porcelain_input_body_json("data", "lmInputs", schema_root),
  returning = porcelain::porcelain_returning_json("lmCoef", schema_root))

The input schema, lmInputs.json is

{
    "$schema": "http://json-schema.org/draft-04/schema#",
    "id": "lmInputs",
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "x": {"type": "number"},
            "y": {"type": "number"}
        },
        "required": ["x", "y"],
        "additionalProperties": false
    }
}

while the response schema lmCoef.json is

{
    "$schema": "http://json-schema.org/draft-04/schema#",
    "id": "lmCoef",
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "Estimate": {"type": "number"},
            "Std. Error": {"type": "number"},
            "t value": {"type": "number"},
            "Pr(>|t|)": {"type": "number"}
        },
        "required": ["name", "Estimate", "Std. Error", "t value", "Pr(>|t|)"],
        "additionalProperties": false
    }
}

These are both fairly strict schemas using both required and additionalProperties. You might want to be more permissive, but we find that strictness here pays off later.

api <- porcelain::porcelain$new(validate = TRUE)$handle(endpoint_lm)

To exercise the API endpoint we need to pass in our input JSON (not an R object).

json <- jsonlite::toJSON(data)
json
#> [{"x":0.4023,"y":1.4242},{"x":0.1957,"y":-0.098},{"x":0.4035,"y":0.9608},{"x":0.0637,"y":-0.4316},{"x":0.3887,"y":0.6208},{"x":0.9755,"y":1.9353},{"x":0.2899,"y":0.7427},{"x":0.6784,"y":1.0825},{"x":0.7353,"y":1.6111},{"x":0.196,"y":0.5008}]

api$request("POST", "/lm", body = json, content_type = "application/json")
#> $status
#> [1] 200
#> 
#> $headers
#> $headers$`x-request-id`
#> [1] "6fd8f36d-39c3-4f20-a997-501fc2b7ceed"
#> 
#> $headers$`Content-Type`
#> [1] "application/json"
#> 
#> $headers$`X-Porcelain-Validated`
#> [1] "true"
#> 
#> 
#> $body
#> [1] "{\"status\":\"success\",\"errors\":null,\"data\":[{\"name\":\"(Intercept)\",\"Estimate\":-0.1655,\"Std. Error\":0.2164,\"t value\":-0.7648,\"Pr(>|t|)\":0.4664},{\"name\":\"x\",\"Estimate\":2.3108,\"Std. Error\":0.4249,\"t value\":5.4381,\"Pr(>|t|)\":0.0006}]}"

POST endpoint, inputs as binary, returning binary

(This example also shows off a few other features)

Handling binary inputs and outputs is supported, provided that you can deal with them in R. In this example we’ll use R’s serialisation format (rds; see ?serialize and ?saveRDS) as an example, but this approach would equally work with excel spreadsheets, zip files or any other non-text data that you work with.

In this example we’ll take some serialised R data and create a png plot as output. We’ll start by writing our target function:

binary_plot <- function(data, width = 400, height = 400) {
  data <- unserialize(data)
  tmp <- tempfile(fileext = ".png")
  on.exit(unlink(tmp))
  png(tmp, width = width, height = height)
  tryCatch(
    plot(data),
    finally = dev.off())
  readBin(tmp, raw(), n = file.size(tmp))
}

Here, we use unserialize to convert the incoming binary data into something usable, plot to a temporary file (which we clean up later, using on.exit). Using tryCatch(..., finally = dev.off()) ensures that even if the plotting fails, the device will be closed. Finally, readBin reads that temporary file in a raw vector.

So, for example (using str to limit what is printed to screen)

bin <- serialize(data, NULL)
str(binary_plot(bin), vec.len = 10)
#>  raw [1:8570] 89 50 4e 47 0d 0a 1a 0a 00 00 ...

It’s hard to tell this is a png, but the first few bytes give it away (the magic number 89 50 4e 47 0d 0a 1a 0a is used at the start of all png files).

endpoint_plot <- porcelain::porcelain_endpoint$new(
  "POST", "/plot", binary_plot,
  porcelain::porcelain_input_body_binary("data"),
  returning = porcelain::porcelain_returning_binary())
api <- porcelain::porcelain$new(validate = TRUE)$handle(endpoint_plot)

Making the request (again using str to prevent printing thousands of hex characters)

str(api$request("POST", "/plot", body = bin,
                 content_type = "application/octet-stream"),
    vec.len = 10)
#> List of 3
#>  $ status : int 200
#>  $ headers:List of 3
#>   ..$ x-request-id         : chr "c52393ba-20fe-4bef-acf7-c9ef95ca8c51"
#>   ..$ Content-Type         : chr "application/octet-stream"
#>   ..$ X-Porcelain-Validated: chr "true"
#>  $ body   : raw [1:8570] 89 50 4e 47 0d 0a 1a 0a 00 00 ...