API reference

1.Introduction

This document is intended to provide the background information needed to use the Sting Vision API successfully. Users with only limited experience should be able to make correct calls to our server with little effort. Code examples are written in Python; examples in other languages will be included soon. Any questions or remarks can be sent to info@stingvision.com.

2.API basics

2.1.API endpoint

All requests need to be sent to our server. The domain for all API calls is:


# API domain
url = 'https://api.stingvision.com/API/'

2.2.Responses

Some algorithms respond with an image, but most return structured data, by default as a JSON object in the body of the response. Every response contains a ‘success’ boolean, and in case of an unsuccessful request an error text is also included to provide additional debug information.
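
As a rough sketch of how such a body can be inspected (the endpoint name, file name and session_key below are placeholders; authentication is covered in section 2.3):

# Rough sketch of inspecting a response body (placeholder endpoint, file and session_key)
import requests

# Any algorithm endpoint; this name is only a placeholder
url = 'https://api.stingvision.com/API/some_algorithm'

# Send a request with a placeholder image and session_key
r = requests.post(url, files={'image': open('example.jpg', 'rb')}, data={'session_key': 'your_session_key'})

# The body always carries a 'success' boolean; on failure an 'error' text is added
data = r.json()
if data.get('success'):
    print('Request processed:', data)
else:
    print('Request failed:', data.get('error'))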

The server uses the standard HTTP status codes. Below you can find context for some of them:

HTTP 200: OK, request received and processed.
HTTP 400: Bad request, required parameters not present or bad syntax.
HTTP 402: Payment required, you have exceeded the free request limit.
HTTP 403: Forbidden, the request was not authenticated; supply credentials or log in.
HTTP 405: Method not allowed, occurs for example when a GET request is sent instead of POST.
HTTP 408: Request timeout, the response took too long; contact our office.
HTTP 413: Payload too large, the uploaded file exceeds 2.5 MB.
HTTP 415: Unsupported media type, the image/video file provided is not supported or corrupt.
HTTP 429: Too many requests, too many requests were sent within a certain time frame.

Sting Vision specific status codes:

HTTP 460: Algorithm unknown, there might be a typo in the algorithm name.

2.3.Authentication

Every request needs to be authenticated. There are two ways to make sure your request is accepted by the server.

2.3.1.On the fly authentication

The first method is to send your username and password with every request. The server will process the credentials and, if authentication is successful, process the request. Because the credentials are verified on every call, this method is slower than using a session (see the next section).

An example of a request in Python using this method looks like this:

 
# Import libraries 
import requests 
 
# Define the URL to the API endpoint 
url = 'https://api.stingvision.com/API/classify_image' 
 
# Construct credentials 
credentials = {'username': 'john', 'password': 'secret_password'} 
 
# Construct image payload 
payload = {'image': open('lena.jpg', 'rb')} 
 
# Send request 
r = requests.post(url, files=payload, data=credentials) 

We start by importing the library and defining the API endpoint. Next we put our credentials in a dictionary object. We then read a JPG image from disk and store it in a dictionary object as well. We now have all the elements to complete a successful request. Finally the actual request is sent with the image stored in “files” and the credentials in “data”; make sure you use exactly these keys, since the server will look for them. Handling the response data from the server is covered later on.

2.3.2.Create session

The second method is to log in to the server and retrieve a secret session_key. You can provide this session_key with a request and it will be used to validate access, which is a much faster process. When logging in, the user can set how long the session_key stays alive: a shorter time increases security, a longer time increases stability. By default the session_key will be destroyed 24 hours after creation.

An example of logging in with Python looks like this:

# Import libraries 
import requests 
 
# Define the URL to the API endpoint 
url = 'https://api.stingvision.com/API/login' 
 
# Construct credentials 
credentials = { 'username': 'john', 'password': 'secret_password'} 
 
# Send request 
r = requests.post(url, data=credentials) 
 
# Print response data
print(r.content) 

The API endpoint is now ‘login’ and the credential object is the same as before. There is no need to include image data because this is an administrative call which does not perform any processing.

By default the server will return a JSON object containing a session_key and a success boolean. The response is accessed by reading the content of the response object r, as in the print statement above.


{'session_key': 'd6afcf6b5d66d9a76df56e8fcfdbe01effd9fa78', 'success': true} 
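
Assuming the login above succeeded, a minimal sketch of extracting the session_key for later use could look like this:

# Minimal sketch: parse the login response and keep the session_key for later requests
import json

response = json.loads(r.content.decode('utf-8'))
if response['success']:
    session_key = response['session_key']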

If there is something missing in the credentials, or you misspelled the dictionary keys, an “error” key will be added and you get a JSON object like:

 
{'success': false, 'error': 'Password not provided'} 
{'success': false, 'error': 'Username not provided'} 
{'success': false, 'error': 'Username and password not provided'} 

If your credentials are supplied but not correct, an HTTP 403: Forbidden error will be issued. The HTTP 401: Unauthorized code is not used because the server does not include a WWW-Authenticate header field, which HTTP 401 requires.

2.3.2.1.Optional parameter device_id

When a device_id is provided with the login request, the server checks whether a valid session already exists for that device and, if so, returns the corresponding session_key. This is useful when the device that logs in cannot write to non-volatile memory. If a session is found but is no longer valid, it is destroyed and recreated with a new stay_alive_abs time. It is the responsibility of the user to make sure the device_ids used are unique.
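
A sketch of a login request with this parameter is shown below; the device_id value is only an illustration.

# Import libraries
import requests

# Define the URL to the API endpoint
url = 'https://api.stingvision.com/API/login'

# Construct credentials with an illustrative device_id
credentials = {'username': 'john', 'password': 'secret_password', 'device_id': 'camera-entrance-01'}

# Send request
r = requests.post(url, data=credentials)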

2.3.2.2.Optional parameter stay_alive_abs

By default the session_key is valid for 24 hours after creation. When a request is made after this time, it will be rejected with an HTTP 403 code. If you want to change this stay-alive time, add a key named “stay_alive_abs” to the request data with the number of minutes after creation at which the session_key should be destroyed.


# Import libraries
import requests

# Define the URL to the API endpoint
url = 'https://api.stingvision.com/API/login'

# Construct credentials
credentials = {'username': 'john', 'password': 'secret_password', 'stay_alive_abs': 10080 }

# Send request
r = requests.post(url, data=credentials)

The example above will create a session that is kept alive for one week after creation. The following table can be used to set the “stay_alive_abs” parameter.

Number of minutes    Alive time
60                   1 hour
1440                 1 day (default)
10080                1 week
43200                30 days
525600               1 year
528480               1 leap year plus 1 day (maximum value accepted by the server)
-1                   never destroy the session

For security reasons it is recommended to use an alive time that is as short as possible: if someone comes across your session_key, they will be able to perform requests on your account. Make sure you keep your session_key secret.

 

2.3.2.3.Destroy session

A session can also be terminated by the client with a log out request as follows:


# Import libraries
import requests

# Define the URL to the API endpoint
url = 'https://api.stingvision.com/API/logout'

# Construct credentials
credentials = {'session_key': 'd6afcf6b5d66d9a76df56e8fcfdbe01effd9fa78'}

# Send request
r = requests.post(url, data=credentials)

The server will destroy the session; a subsequent request using the old session_key will be rejected with an HTTP 403 response. It is recommended to log out when a long idle period is expected.

3.API features

 

The Sting Vision API is a comprehensive collection of computer vision algorithms.

The name of the algorithm is also the endpoint of the request. In the next chapters we will discuss the different algorithms and how to use them. Every chapter begins with a quick reference table containing the request endpoint, required input, optional inputs and the response form.

We support JPEG, PNG, TIF and BMP images. We recommend using JPEG or PNG since these will generally be smaller and will reduce network load.

 

 

3.1.Classify image

 

Endpoint    https://api.stingvision.com/API/classify_image
Method      POST
Required    Single image (PNG, JPG, BMP or TIF)
Optional    return_count, return_decimals
Response    JSON array with classifications: [{‘class’: ‘text’, ‘score’: number}, {‘class’: ‘text’, ‘score’: number}, …]

The image classification algorithm classifies the single dominant object (in terms of size) contained in an image. An example is shown below.
[Image: classification result]

The algorithm needs a single image as input, included in the “files” attribute of the POST request body. The key name of the image must be “image”; the server looks for this key.
A JSON object is returned with the top predictions and their individual certainty. By default the algorithm returns its top 5 predictions and expresses the certainty of each class with 5 decimals.

3.1.1.Optional parameter return_count

The number of predictions can be set using the optional “return_count” parameter. The default is 5 and valid input is a number between 1 and 10.

3.1.2.Optional parameter return_decimals

The prediction certainty is a number between 0 and 1 and by default has 5 decimals. The number of decimals can be modified with the return_decimals parameter; the lower limit is 1 and the upper limit is 10. When return_decimals is set to 5, for example, a number with four decimals may still be returned: the fifth decimal is zero and is therefore discarded. When a score is very low it can be returned in scientific notation. Make sure this does not cause any problems in your software; usually such predictions are discarded.
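
As a small illustration (the example array below is made up), scores in scientific notation such as 8e-05 parse to ordinary Python floats, so very low predictions can simply be filtered out:

# Sketch: scientific-notation scores parse to ordinary floats (example data is made up)
import json

predictions = json.loads('[{"class": "park bench", "score": 0.99393}, {"class": "folding chair", "score": 8e-05}]')

# Keep only predictions above a chosen cut-off
usable = [p for p in predictions if p['score'] >= 0.01]
print(usable)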

3.1.3.Example code and output

# Import libraries 
import requests 
 
# Define the URL to the API endpoint 
url = 'https://api.stingvision.com/API/classify_image' 
 
# Construct the image payload 
payload = {'image': open('car.jpg', 'rb')} 
 
# Construct the optional algorithm parameters 
parameters = {'return_count': 5, 'return_decimals': 5, 'session_key': 'd6afcf6b5d66d9a76df56e8fcfdbe01effd9fa78' } 
 
# Send request 
r = requests.post(url, files=payload, data=parameters) 

The server will respond with a JSON array of objects. Each object is a prediction and contains the class name and the prediction score.  An example of the response is given below.


[{'class': 'sports car, sport car', 'score': 0.57385}, {'class': 'limousine, limo', 'score': 0.11086}, {'class': 'grille, radiator grille', 'score': 0.07677}, {'class': 'car wheel', 'score': 0.04356}]

For the next example the following photo (without the result footer) is sent with the request; the results and JSON response can be found below.
[Image: classification results of a park bench]

JSON response:

 
[{'class': 'park bench', 'score': 0.99393}, {'class': 'swing', 'score': 0.00177}, {'class': 'ashcan, trash can, garbage can', 'score': 0.00018}, {'class': 'lakeside, lakeshore', 'score': 0.00013}, {'class': 'folding chair', 'score': 8e-05}] 

 

 

3.2.OCR

 

Endpoint    https://api.stingvision.com/API/ocr
Method      POST
Required    Single image (PNG, JPG, BMP or TIF)
Optional    none
Response    JSON object: {‘success’: True, ‘result’: ‘text’}

Optical Character Recognition (OCR) is used to read text from images. Our method is trained using many different fonts and languages. It can be used to implement a document scanner or search application that interprets written information. The algorithm can be used to recognize handwriting, but for the best results a specific handwriting font should be learned by the algorithm. In the future we will extend this algorithm to be able to learn your specific handwriting.

For the best results the text should be as straight as possible and there should be a high contrast between characters and background.


# Import libraries
import requests

# Define the URL to the API endpoint
url = 'https://api.stingvision.com/API/ocr'

# Construct the image payload 
payload = {'image': open('text.jpg', 'rb')}

# Construct the optional algorithm parameters 
parameters = {'session_key': 'd6afcf6b5d66d9a76df56e8fcfdbe01effd9fa78'}

# Send request
r = requests.post(url, files=payload, data=parameters)

The default language is English; we will implement other languages soon. The following test image gives the result below:

[Image: OCR test image]

OCR Results for this image:

{'success': True, 'result': 'Our computer vision API\nwill help you create your\napplication. Even if you\nhave to read text from\nan image.\n\nOur computer Vision API\nwill help you create your\napplication. Even if you\nhave to read text from\nanimage.\n\nOur computer Vision API\nwill help you create your\napplication. Even if you\n\nhave to read text from\nan image.'}

Newlines that are detected are encoded with \n.
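
Assuming the request above succeeded, a small sketch of splitting the recognized text into separate lines could look like this:

# Sketch: parse the OCR response from the request above and split the text on the \n markers
import json

response = json.loads(r.content.decode('utf-8'))
for line in response['result'].split('\n'):
    print(line)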

 

3.3.Face detection

Endpoint    https://api.stingvision.com/API/face_detection
Method      POST
Required    Single image (PNG, JPG, BMP or TIF)
Optional    none
Response    JSON object: {‘success’: True, ‘result’: array of face objects with the top, right, bottom and left pixel locations and a detection score}

The face detection algorithm can detect faces in different poses in an image. The algorithm is not limited to a maximum number of faces, but the more face candidates that need to be processed, the more time the algorithm needs to complete.


# Import libraries
import requests

# Define the URL to the API endpoint
url = 'https://api.stingvision.com/API/face_detection'

# Construct the image payload 
payload = {'image': open('group.jpg', 'rb')}

# Construct the optional algorithm parameters 
parameters = {'session_key': 'd6afcf6b5d66d9a76df56e8fcfdbe01effd9fa78' }

# Send request
r = requests.post(url, files=payload, data=parameters)

Performing the request above will result in a JSON response that contains an array of face objects. Each face object holds the top, right, bottom and left pixel border of the face, which can be used to draw bounding boxes or to extract the faces for further processing. There is also a detection score: the higher the score, the more confident the detection.
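
A minimal sketch of reading these face objects from the response of the request above is given below; the exact key names (‘top’, ‘right’, ‘bottom’, ‘left’, ‘score’) are assumed from the description.

# Minimal sketch: read the face objects from the response
# (key names 'top', 'right', 'bottom', 'left' and 'score' are assumed from the description)
import json

response = json.loads(r.content.decode('utf-8'))
if response['success']:
    for face in response['result']:
        width = face['right'] - face['left']
        height = face['bottom'] - face['top']
        print('Face at ({}, {}), size {}x{}, score {}'.format(face['left'], face['top'], width, height, face['score']))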

4.Python examples

4.1.Simple object classification

In this example an image is sent to the classify_image endpoint. The credentials are provided with the request and the classification is done using custom parameters.

# Import libraries
import requests
import json

# Construct endpoints
classificationURL = 'https://api.stingvision.com/API/classify_image'

# Construct credentials and optional algorithm parameters, replace with your own username and password
credentials = {'username': 'john', 'password': 'secret_password', 'return_count': 3, 'return_decimals': 3}

# Construct image payload
payload = {'image': open('car.jpg', 'rb')}

# Send request
r = requests.post(classificationURL, files=payload, data=credentials)

# Use results if request returns a status code 200
if r.status_code == 200:
    # Load the json response into python
    response = json.loads(r.content.decode('utf-8'))

    # Step through all the results and print the class and score
    for i in response['result']:
        print(i['class'])
        print(i['score'])
else:
    # In case of an unsuccessful request, print status code and content
    print(r.status_code)
    print(r.content)
 

The script above will print out the returned classifications. As requested, three classes are returned, each with a score rounded to 3 decimals.


sports car, sport car
0.531
limousine, limo
0.166
grille, radiator grille
0.071

4.2.Object classification and OCR

This example shows how you can use the result of one API feature as input for another. First the image is classified, and if the algorithm is confident enough we use this information to decide whether to look for text in the image.

# Import libraries
from PIL import Image
import requests
import json

# Construct endpoints
classificationURL = 'https://api.stingvision.com/API/classify_image'
ocrURL = 'https://api.stingvision.com/API/ocr'

# Construct credentials
credentials = {'username': 'john', 'password': 'secret_password'}

# Read image as a buffered reader object 
imagedata = open('website.jpg', 'rb')
# Construct image payload
payload = {'image': imagedata}

# Send request
r = requests.post(classificationURL, files=payload, data=credentials)

# Define a threshold for the classification score
threshold = 0.8

# Use results if request returns a status code 200
if r.status_code == 200:
    # Load the json response into python datatype
    response = json.loads(r.content.decode('utf-8'))

    # Get top-1 classification and score
    objectClass = response['result'][0]['class']
    objectScore = response['result'][0]['score']

    # Make sure the system is fairly confident of the classification
    if objectScore > threshold:
        print('Class found: ' + objectClass)
        if 'website' in objectClass:
            # Run OCR on class that has a high probability of having text in it

            # Because imagedata is a buffered reader object we need to return to the beginning of the file
            imagedata.seek(0)

            # Construct OCR request to server
            r2 = requests.post(ocrURL, files=payload, data=credentials)

            # Use result when successful
            if r2.status_code == 200:     
                # Parse the json string
                response = json.loads(r2.content.decode('utf-8'))

                # Retrieve the definition chapter
                tag = 'Definition [edit]'

                # Find the position of the tag in the recognized text
                index = response['result'].find(tag)

                # find() returns -1 when the tag is not present
                if index >= 0:
                    print('Definition found: ')

                    # Keep only the text that follows the tag
                    definition = response['result'][index + len(tag):].strip()

                    # Print our result.
                    print(definition)
                else:
                    print('No definition found on image')
            else:
                # The ocr request was unsuccessful, print the server response so we can check where it went wrong
                print(r2.content)
        else:
            print('No text expected')
    else:
        print('Not confident enough to say what class I think it is')
            
else:
    # In case of an unsuccessful classification request, print status code and returned content
    print(r.status_code)
    print(r.content)

We provide the script with website.jpg, which is a partial screenshot of the Computer Vision Wikipedia page.

In this case we know beforehand that there is a chapter called “Definition” so we filter the result from the server to isolate this chapter.


Class found: web site, website, internet site, site
Definition found:
Computer vision is an interdisciplinary field that deals with how computers can be made for gaining high-level understanding from digital images or videos. From the perspective of engineering, it seeks to
automate tasks that the human visual system can d0_[i][2][3] "Computer vision is ooncerned with the automatic extraction, analysis and understanding of useful information from a single image or a sequence of
images. it involves the development of a theoretical and algorithmic basis to achieve automatic visual understanding."[9] As a scientific discipline, computer vision is concerned with the theory behind artificial
systems that extract information from images. The image data can take many forms, such as video sequences, views from multiple cameras or multi-dimensional data from a medical scannerlw] As a
technological discipline computer vision seeks to apply its theories and models for the construction of computer vision systems.

History [edit]