WattzOn Snap API v1 Documentation

WattzOn Snap is a software-as-a-service (SaaS) platform for extracting
usage data from images of bills from residential utility providers.
This document describes the individual Snap API calls and their features.

The Snap API uses JSON as a serialization format. As such, it requires no
specific software to use. Authentication and communication proceed over
TLS; clients must present a certificate.

Definitions

  • Client
    The software interacting with the WattzOn Snap API.

Service Introduction

The API provides for:

  • Creating a Job (uploading PDF file of a utility bill)
  • Retrieving a list of recent Jobs
  • Retrieving the results of a specific Job
  • Retrieving bill images associated with a Job

Results are returned as JSON objects or HTTP error codes.

Authentication

All client identification and authorization is made through a client-side TLS
certificate. You do not need to purchase a certificate; WattzOn will act as
a certificate authority for its own application. The procedure is as
follows:

  1. Generate a CSR (Certificate Signing Request). Here’s how to do it with openssl:

    openssl req -new -newkey rsa:2048 -nodes -keyout client.key -out client.csr

    Don’t enter a challenge password. This creates two files: client.csr (the
    CSR) and client.key (the private key).

  2. Send the CSR to your WattzOn technical or sales contact. Do NOT send
    the private key; keep it secret.

  3. WattzOn will sign the CSR and send you a certificate named client.crt.
    You can now use this in conjunction with your private key (client.key) to
    connect to the Snap API. However, depending on your client software, you may
    need to perform another step to create a combined certificate-key file.

  • Some HTTPS clients may require a .pfx/.p12 (PKCS #12) file, a DER file, or
    another format. OpenSSL can convert to these formats.

NOTE: TLS certificates expire after one year. To avoid an interruption in service, arrange
to have a new certificate signed before your current one expires. Your existing signed certificate
will continue to work until it expires, giving you time to deploy the new certificate.
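
To get advance warning of the one-year expiry, a client can check how many days its certificate has left. The following is a minimal sketch, not part of the Snap API: it assumes you obtain the expiry line yourself by running `openssl x509 -enddate -noout -in client.crt`, and the function name `days_until_expiry` is hypothetical.

```python
import ssl
import time

SECONDS_PER_DAY = 86400


def days_until_expiry(enddate_line, now=None):
    """Return the number of days left before the certificate expires.

    enddate_line is the text printed by
        openssl x509 -enddate -noout -in client.crt
    for example "notAfter=Jan  5 09:34:43 2018 GMT".
    A negative result means the certificate has already expired.
    """
    # Keep only the timestamp that follows "notAfter=".
    not_after = enddate_line.strip().split("=", 1)[-1]
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires - now) / SECONDS_PER_DAY
```

A scheduled job could call this and alert when the result drops below a threshold of your choosing (30 days, say), leaving time to request a newly signed certificate.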

API Return Codes

Successful calls will return an HTTP 2xx code (typically HTTP 200 (OK), HTTP 201 (Created),
or HTTP 202 (Accepted)). Unsuccessful calls will return one of the following HTTP codes:

Code  Meaning
400   Bad Request
401   Unauthorized. The certificates are not present or are not correct for this client
403   Forbidden. The client does not have access to the requested resource
404   Not Found. The requested resource does not exist
429   Too Many Requests. The rate limit has been exceeded
500   Internal Server Error
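
A client typically treats these codes differently: some indicate a problem with the request itself, while others may succeed if tried again later. The sketch below is one possible policy, not a documented Snap behavior. Treating 429 and 500 as retryable, and the exponential backoff schedule, are assumptions; the names `should_retry` and `backoff_delays` are hypothetical.

```python
# Assumption: only rate limiting (429) and server errors (500) are worth
# retrying; 400, 401, 403 and 404 need a change to the request or credentials.
RETRYABLE_CODES = {429, 500}


def should_retry(status_code):
    """Return True if a failed call may succeed when repeated later."""
    return status_code in RETRYABLE_CODES


def backoff_delays(attempts, base=1.0, cap=60.0):
    """Return the waits (in seconds) before each retry, doubling up to cap."""
    return [min(cap, base * 2 ** attempt) for attempt in range(attempts)]
```

For example, `backoff_delays(4)` yields waits of 1, 2, 4 and 8 seconds.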

Job Service

Job Statuses

A Job can be in one of several states, as reflected by the status attribute. Newly created jobs are either
ENQUEUED or, if the provider is not supported, UNSUPPORTED (this status indicates that uploaded documents
are to be held without processing).

Once a Job has entered the processing pipeline, its status becomes PROCESSING.

After processing, the job status will be either COMPLETE, indicating that results are available, or, in very rare
circumstances, FAILED, indicating that something has gone unrecoverably wrong with processing.

UNSUPPORTED, COMPLETE, and FAILED are ‘final’ states, meaning that no further processing will happen. Snap may,
over time, add additional statuses.
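
The status lifecycle above suggests a simple polling loop: fetch the job until it reaches a final state. The following is a minimal sketch under stated assumptions. `fetch_job` is a hypothetical callable supplied by you that performs the GET request and returns the decoded job object; the interval and timeout defaults are arbitrary. Because Snap may add statuses over time, any status not listed as final is treated as still in progress, and the timeout bounds the wait.

```python
import time

# The three final states named in this document.
FINAL_STATUSES = {"UNSUPPORTED", "COMPLETE", "FAILED"}


def wait_for_job(fetch_job, job_id, interval=10.0, timeout=600.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until the job reaches a final status, then return the job.

    fetch_job(job_id) must return a dict with a "status" key.
    Raises TimeoutError if no final status is seen within timeout seconds.
    """
    deadline = clock() + timeout
    while True:
        job = fetch_job(job_id)
        if job["status"] in FINAL_STATUSES:
            return job
        if clock() >= deadline:
            raise TimeoutError("job %s still %s" % (job_id, job["status"]))
        sleep(interval)
```

Passing `sleep` and `clock` as parameters keeps the loop easy to test without real delays.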

Create Job

path: /api/v1/jobs/

method: POST

input parameters:

  • email: A valid email address for the customer of the job.
  • provider: A utility provider identifier.
  • tags: A list of tags to associate with this job (optional).
  • document: A dictionary of the PDF file to process, which contains:
    • name: The name of the file, with extension.
    • content: The base64 encoded contents of the file.

Tags are strings. (Numbers will be converted to strings). Each is limited to 255 characters, and should not
contain commas. Limit yourself to alphanumerics, colons, and dashes, and you’ll be fine.

Document size is limited to 5MB. Exceeding the size limit results in a 400 (Bad Request) return code.

output:

  • on success:

    • HTTP 201 (Created)
    • id: The identifier number of the job.
    • created_at: The time (in UTC) that the job was created.
    • email: The email address provided when the job was created.
    • provider: The ID of the bill’s provider.
    • status: Either ENQUEUED or, if the provider is not supported, UNSUPPORTED.
    • tags: The list of tags associated with this job.
    • document: An object describing the uploaded document.
  • on error:

    • HTTP 404 (Not Found) if provider id is not in database.
    • HTTP 400 (Bad Request) if one or more input parameters are missing or have invalid content.
    • error: error message.

Example:

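# Base64-encode the bill and strip line wraps so the JSON string stays intact.
pdfcontent=$(base64 < bill.pdf | tr -d '\n')
postdata='{"email":"user@example.com","tags":["sfid:12345","demo"],"document": {"name":"bill.pdf","content":"'"$pdfcontent"'"},"provider":4}'
# Process substitution (bash) keeps the large payload off the command line.
curl -X POST --key client.key --cert client.crt \
-H "Content-Type: application/json" \
https://snap.wattzon.com/api/v1/jobs/ \
-d @<(echo "$postdata")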

which returns

{
    "id": 513,
    "created_at": "2016-07-25T16:26:12.320000",
    "email": "user@example.com",
    "provider": 4,
    "status": "ENQUEUED",
    "tags": ["sfid:12345", "demo"],
    "document" : {"id": 259, "name":"bill.pdf"}
}

Note that it typically takes at least 30 seconds to process a job.
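
The Create Job request body can also be assembled programmatically. The following is a minimal sketch, not an official client: the function name `build_job_payload` is hypothetical, the 5MB limit is assumed to apply to the raw file size and to mean 5 * 1024 * 1024 bytes, and the tag check simply enforces the conservative character set recommended above.

```python
import base64
import json
import re

# Assumption: "5MB" is measured on the raw (unencoded) file, as 5 MiB.
MAX_DOCUMENT_BYTES = 5 * 1024 * 1024
# Alphanumerics, colons and dashes only, at most 255 characters.
TAG_PATTERN = re.compile(r"[A-Za-z0-9:-]{1,255}")


def build_job_payload(email, provider, pdf_bytes, filename, tags=()):
    """Return the JSON request body for POST /api/v1/jobs/ as a string."""
    if len(pdf_bytes) > MAX_DOCUMENT_BYTES:
        raise ValueError("document exceeds the 5MB limit")
    # The API converts numbers to strings; do the same on the client side.
    clean_tags = [str(tag) for tag in tags]
    for tag in clean_tags:
        if not TAG_PATTERN.fullmatch(tag):
            raise ValueError("unsupported tag: %r" % tag)
    return json.dumps({
        "email": email,
        "provider": provider,
        "tags": clean_tags,
        "document": {
            "name": filename,
            "content": base64.b64encode(pdf_bytes).decode("ascii"),
        },
    })
```

The resulting string can be sent with a Content-Type of application/json over a TLS connection that presents client.crt and client.key, for instance with urllib.request and an ssl.SSLContext on which load_cert_chain has been called.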

Retrieve a List of Recent Jobs

path: /api/v1/jobs/

method: GET

input: (optional) tag

output:

  • HTTP 200 (OK) on success.
  • A list of Job summaries. Each element contains:
    • id: The identifier number of the job.
    • created_at: The date and time (in UTC) that the job was created.
    • email: The email address provided when the job was created.
    • provider: The provider code supplied when the job was created.
    • status: The job status, as described above.
    • tags: The list of tags associated with a job.
    • document: An object describing the uploaded document.

The document object contains the following fields:

  • id: The id that identifies the uploaded document.
  • name: The name of the file.

The most recent 20 Jobs are returned, ordered by submission time, with most recently submitted job first. (At times
more than 20 Job summaries may be returned, but this version of the API guarantees only the most recent 20 summaries.)

A future version of the API specification will describe a way to page through job summaries.

Example:

curl -X GET --key client.key --cert client.crt https://snap.wattzon.com/api/v1/jobs/

will return something like:

[
    {
        "id": 513,
        "created_at": "2016-07-25T16:26:12.320000",
        "email": "user@example.com",
        "provider": 4,
        "status": "PROCESSING",
        "tags": ["sfid:12345", "demo"],
        "document" : {"id": 259, "name":"bill.pdf"}
    },
    {
        "id": 497,
        "created_at": "2016-07-25T16:16:12.320000",
        "email": "user@example.com",
        "provider": 4,
        "status": "COMPLETE",
        "tags": [],
        "document" : {"id": 250, "name":"bill_1.pdf"}
    }
]

Filtering the Job List by tag

Adding ?tag=value to the URL will restrict the search to jobs that have “value” in their tag lists.
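
Tags such as sfid:12345 contain characters that are best percent-encoded in a query string. The following is a small sketch of building the list URL with an optional tag filter; the helper name `jobs_url` is hypothetical, and it assumes the server decodes standard percent-encoding.

```python
from urllib.parse import urlencode

BASE_URL = "https://snap.wattzon.com/api/v1"


def jobs_url(tag=None):
    """Return the job list URL, optionally filtered by a single tag."""
    url = BASE_URL + "/jobs/"
    if tag is not None:
        # urlencode escapes characters such as ":" and spaces.
        url += "?" + urlencode({"tag": tag})
    return url
```

For example, `jobs_url("sfid:12345")` produces a URL ending in `/jobs/?tag=sfid%3A12345`.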

Retrieve Results for a Specific Job

path: /api/v1/jobs/<id>/

method: GET

input: id: The identifier number of the job

output:

  • on success:

    • HTTP 200 (OK).
    • id: The identifier number of the job.
    • created_at: The date and time (in UTC) that the job was created.
    • email: The email address provided when the job was created.
    • provider: The provider code supplied when the job was created.
    • status: The job status, as described above.
    • tags: The list of tags associated with a job.
    • document: An object describing the uploaded document.

    The document object contains the following fields:

    • id: The id that identifies the uploaded document.
    • name: The name of the file.

    If the job is complete, the output will additionally contain:

    • results: A dict with the information extracted from the document.
  • on error:

    • HTTP 404 (Not Found) if job id is not found.
    • error: error message.

Example:

curl -X GET --key client.key --cert client.crt  https://snap.wattzon.com/api/v1/jobs/578/

will return something like:

{
  "id": 578,
  "created_at": "2016-06-12T04:15:40.746000",
  "email": "user@example.com",
  "provider": 4,
  "status": "COMPLETE",
  "tags": ["sfid:99999"],
  "document": {"id":5796,"name":"bill.pdf"},
  "results": {
    "Previous_balance": {
        "tokens": {
            "raw": [
                { "page_number": 1, "x1": 1315, "y2": 769, "index": 19, "y1": 730, "x2": 1454 }
            ],
            "content": "$65.04"
        }
    },
    "Name": {
        "tokens": {
            "raw": [
                { "page_number": 1, "x1": 1546, "y2": 376, "index": 59, "y1": 346, "x2": 1650 },
                { "page_number": 1, "x1": 1666, "y2": 384, "index": 60, "y1": 346, "x2": 1801 }
            ],
            "content": "DAVE SMITH"
        }
    },
    "Account_number": {
        "tokens": {
            "raw": [
                { "page_number": 1, "x1": 1643, "y2": 554, "index": 64, "y1": 524, "x2": 1869 }
            ],
            "content": "9999999999"
        }
    },
    "Bill_date": {
        "tokens": {
            "raw": [
                { "page_number": 1, "x1": 338, "y2": 514, "index": 5, "y1": 471, "x2": 601 },
                { "page_number": 1, "x1": 619, "y2": 522, "index": 6, "y1": 472, "x2": 658 },
                { "page_number": 1, "x1": 682, "y2": 513, "index": 7, "y1": 472, "x2": 805 }
            ],
            "content": "November 5, 2015"
        }
    },
    "Elec_Meter_number": {
        "tokens": {
            "raw": [
                { "page_number": 3, "x1": 1615, "y2": 488, "index": 21, "y1": 458, "x2": 1759 }
            ],
            "content": "E123456"
        }
    },
    "Elec_Usage": {
        "tokens": {
            "raw": [
                { "page_number": 3, "x1": 2387, "y2": 2121, "index": 263, "y1": 2091, "x2": 2429 }
            ],
            "content": "89"
        }
    },
    ...
  }
}
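
In the raw format, each field carries its extracted text in tokens.content and one entry per token in tokens.raw. The sketch below shows one way to read a field and to compute a rectangle enclosing its tokens on each page. It assumes that (x1, y1) and (x2, y2) are opposite corners with x1 <= x2 and y1 <= y2, as in the sample above; this document does not state the coordinate system or units. The function names are hypothetical.

```python
def field_content(field):
    """Return the extracted text of one raw result field."""
    return field["tokens"]["content"]


def field_bounding_boxes(field):
    """Return {page_number: (x1, y1, x2, y2)} enclosing the field's tokens."""
    boxes = {}
    for tok in field["tokens"]["raw"]:
        page = tok["page_number"]
        # Start from this token's own box the first time a page is seen.
        x1, y1, x2, y2 = boxes.get(
            page, (tok["x1"], tok["y1"], tok["x2"], tok["y2"]))
        boxes[page] = (min(x1, tok["x1"]), min(y1, tok["y1"]),
                       max(x2, tok["x2"]), max(y2, tok["y2"]))
    return boxes
```

Applied to the Name field in the sample, this gives a single box on page 1 spanning both tokens, which could be used to highlight the field on the bill image.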

Request transformed results

By default, raw results are returned. Appending ?results=transformed to the URL will cause the API to return transformed results instead.

Example:

curl -X GET --key client.key --cert client.crt "https://snap.wattzon.com/api/v1/jobs/578/?results=transformed"

will return something like:

{
  "id": 578,
  "created_at": "2016-06-12T04:15:40.746000",
  "email": "user@example.com",
  "provider": 4,
  "status": "COMPLETE",
  "tags": ["sfid:99999"],
  "document": {"id":5796,"name":"bill.pdf"},
  "results": {
    "Previous_balance": "$65.04",
    "Name": "DAVE SMITH",
    "Account_number": "9999999999",
    "Bill_date": "November 5, 2015",
    "Elec_Meter_number": "E123456",
    "Elec_Usage": "89",
    ...
  }
}
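
Transformed values arrive as strings, so a client usually converts them before doing arithmetic or date comparisons. The sketch below parses the two formats that appear in the sample above. It assumes amounts look like "$65.04" and dates like "November 5, 2015"; other providers may format these fields differently, so treat the parsers as illustrative. The function names are hypothetical.

```python
from datetime import datetime
from decimal import Decimal


def parse_amount(text):
    """Convert a string such as "$65.04" to a Decimal."""
    # Drop the currency symbol and any thousands separators.
    return Decimal(text.replace("$", "").replace(",", "").strip())


def parse_bill_date(text):
    """Convert a string such as "November 5, 2015" to a date.

    Month names are matched using the current locale (English assumed).
    """
    return datetime.strptime(text.strip(), "%B %d, %Y").date()
```

Decimal is used instead of float so that currency values are not subject to binary rounding.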

Retrieving a Bill Image

path: /api/v1/jobs/<job_id>/document/

method: GET

input:

  • job_id: The identifier number of the job.
  • raw: Pass raw=yes as a query string argument to receive the raw file instead of JSON (optional).

output:

  • on success:

    • HTTP 200 (OK).
    • id: The ID of the file.
    • name: The name of the file.
    • content: The base64-encoded contents of the file.

    If the raw=yes query string argument is passed, no JSON structure is returned; the response body is the raw
    binary data of the file.

  • on error:

    • HTTP 404 if job or file does not exist.
    • error: error message.

Example:

curl -X GET --key client.key --cert client.crt https://snap.wattzon.com/api/v1/jobs/578/document/

will return something like:

{
  "id": 579656,
  "name": "bill.pdf",
  "content": "..."
}

where ... is a very long base64 encoded string representing the contents of the image.
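
When the JSON form is used, the client has to decode the content field to recover the original file. A minimal sketch, with the hypothetical helper name `decode_document`:

```python
import base64


def decode_document(doc):
    """Return (name, file_bytes) from a document response object."""
    return doc["name"], base64.b64decode(doc["content"])
```

The returned bytes can be written to disk in binary mode. If the name is used as a file name, it is prudent to strip any directory components first (for example with os.path.basename).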

Utility Providers

WattzOn Snap utility providers have the same IDs as Link utility providers.
If you don’t know your provider ID, you can search for it using the Link API.

NOTE: The endpoints are the same as in the Link API documentation.

Query ZIP Information

path: https://api.wattzon.com/link/4.0/zips/<zipcode>

method: GET

input:

  • zipcode: ZIP code

output:

  • on success:

    • HTTP 200 (OK).
    • zipcode: ZIP code
    • city: city where zipcode is located
    • county: county where zipcode is located
    • state: state where zipcode is located
    • latitude: latitude of location
    • longitude: longitude of location
  • on error:

    • HTTP 404 (Not Found) if the ZIP code is not found.
    • error: error message.

Example:

curl -X GET --key client.key --cert client.crt https://api.wattzon.com/link/4.0/zips/00007

Find utility provider

path: https://api.wattzon.com/link/4.0/utilities

method: GET

Query Parameters:

  • zipcode: ZIP code (optional if name parameter present)
  • name: name to match (optional if zipcode parameter present)
  • type: utility type (e.g. electric, gas, water) (optional)

NOTE: At least one of zipcode or name is required.

output:

  • on success:

    • HTTP 200 on success.
    • provider_ids: list of utility provider IDs (WattzOn IDs; not LSE IDs)
  • on error:

    • HTTP 404 if Utility Provider not found
    • error: error message.

Example:

curl -X GET --key client.key --cert client.crt https://api.wattzon.com/link/4.0/utilities/?zipcode=00007
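
The rule that at least one of zipcode or name must be present can be enforced before the request is sent. The following is a small sketch; the helper name `find_utilities_url` is hypothetical, and the trailing slash before the query string follows the curl example above.

```python
from urllib.parse import urlencode

LINK_BASE_URL = "https://api.wattzon.com/link/4.0"


def find_utilities_url(zipcode=None, name=None, utility_type=None):
    """Return the URL for a utility provider search."""
    if zipcode is None and name is None:
        raise ValueError("either zipcode or name is required")
    params = {}
    if zipcode is not None:
        # Keep ZIP codes as strings so leading zeros survive.
        params["zipcode"] = str(zipcode)
    if name is not None:
        params["name"] = name
    if utility_type is not None:
        params["type"] = utility_type
    return LINK_BASE_URL + "/utilities/?" + urlencode(params)
```

Each ID in the returned provider_ids list can then be passed as the provider parameter when creating a Snap job.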

Query Utility Provider Information

path: https://api.wattzon.com/link/4.0/utilities/<provider_id>

method: GET

input:

  • provider_id: utility provider ID (e.g. from a find call)

output:

  • on success:

    • HTTP 200 on success.
    • name: utility name
    • type: utility types supported by the utility: comma-separated list of types (e.g. electric, gas, water)
    • lseid: LSE ID (if available)
    • homepage_url: link to the provider’s website
  • on error:

    • HTTP 404 if Utility Provider not found
    • error: error message.

Example:

curl -X GET --key client.key --cert client.crt https://api.wattzon.com/link/4.0/utilities/3150

Notification Support

Purpose

After starting a utility provider data extraction, there are two options for checking on the extraction job status. The first is Snap’s Retrieve Results for a Specific Job API call, which is ideal for interactive applications; it’s immediate and can provide insight into running jobs as well as the status of completed jobs.

The second option is a notification via Amazon’s SQS (Simple Queue Service) mechanism. This is especially useful when the data consumer’s API access is decoupled from the end user interface (WattzOn 3in1 users, for example) and the consumer needs a way to track extraction jobs.

Configuration

You’ll need to provide WattzOn with an Amazon SQS endpoint and associated access ID. Your technical contact can help you with this.

When configured, SQS notifications are sent upon extraction job completion.

Message Format

An SQS notification is in JSON format. Here’s an example for a successful extraction:

{
    "app": "snap",
    "job_id": 513,
    "tags": "['sfid:12345', 'demo']",
    "created_at": "2018-01-01 20:40:00.800922+00:00",
    "final_state": "COMPLETE",
    "provider": 4,
    "email": "user@example.com",
    "document": {"id": 259, "name":"bill.pdf"},
    "got_results": true
}

These fields are:

  • app: The application (in our case, always snap).
  • job_id: Extraction Job ID.
  • tags: List of tags associated with a job.
  • created_at: Timestamp (in UTC) of notification message.
  • final_state: End-of-job status code. See Job Statuses for a list.
  • provider: The provider code supplied when the job was created.
  • email: The email address provided when the job was created.
  • document: An object describing the uploaded document.
  • got_results: true if results are available, false otherwise.
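
A consumer reading from the queue needs to decode each message body. The sketch below parses the format shown in the example above; receiving the message from SQS is left out. Note that in the example the tags field is a string holding a Python-style list literal rather than a JSON array, so the sketch accepts both forms; whether real messages always use the string form is an assumption. The function name `parse_notification` is hypothetical.

```python
import ast
import json
from datetime import datetime


def parse_notification(body):
    """Decode an SQS message body into a dict with typed fields."""
    msg = json.loads(body)
    tags = msg.get("tags", [])
    if isinstance(tags, str):
        # The sample shows tags as "['sfid:12345', 'demo']"; literal_eval
        # parses that text safely without executing code.
        tags = ast.literal_eval(tags) if tags.strip() else []
    msg["tags"] = [str(tag) for tag in tags]
    # Example value: "2018-01-01 20:40:00.800922+00:00"
    msg["created_at"] = datetime.fromisoformat(msg["created_at"])
    return msg
```

When got_results is true, the job_id can be used to retrieve the results of that job.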

Revision

  • 12/14/2018: Initial documentation of the Snap v1 API.