Lantern | API Documentation

Lantern API

The Lantern API lets you create new requests for compliance information, query those requests, monitor their progress, and query your account for information about all your requests.

The base URL for the API is: https://api.cottagelabs.com/service/lantern

To use the Lantern API you'll need an API key. If you don't yet have one, please contact us.

The following table summarises the API endpoints available to you:

Endpoint          Method   Summary
/                 GET      Confirms that the service is present
/                 POST     Allows you to create a new job by sending a suitable JSON document
/:job             GET      Show the current job, containing all the identifier information provided when it was created
/:job/progress    GET      Information about the progress of the :job
/:job/todo        GET      Information about the items in the request which are still pending
/:job/results     GET      Returns the processed records, as JSON or CSV (specify ?format=csv)
/jobs/:email      GET      Information for all jobs under the specified account address
/quota/:email     GET      Information about the quota for the specified account address
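
Every endpoint in the table is addressed relative to the base URL, with the API key passed as the apikey query parameter. The following is a minimal Python sketch of assembling those URLs. The helper names lantern_url and job_url are illustrative only and are not part of the API.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.cottagelabs.com/service/lantern"


def lantern_url(path, api_key, **params):
    """Build a full URL for an endpoint path, adding the apikey parameter."""
    query = urlencode({"apikey": api_key, **params})
    return f"{BASE_URL}/{path.lstrip('/')}?{query}"


def job_url(job_id, api_key, part=""):
    """Build a URL for /:job, or /:job/<part> (progress, todo or results)."""
    path = quote(job_id, safe="")
    if part:
        path = f"{path}/{part}"
    return lantern_url(path, api_key)
```

For example, job_url("<your job id>", "<api key>", "progress") gives the address of the progress endpoint for that job.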

Creating a new Job

To create a new job, POST the JSON of your request to the root of the API:

POST /?apikey=<api key>
Content-Type: application/json; charset=UTF-8

<Job request body>

The job request is JSON of the form:
{
    "email": "<your email address>",
    "filename": "<the name you want to give the data you upload>",
    "list" : [
        {
            "Article title" : "<title of the article>",
            "DOI" : "<article DOI>",
            "PMCID" : "<Europe PMC identifier for the article (starting with 'PMC')>",
            "PMID" : "<PubMed identifier for the article>"
        },
        ... more article identifier records ...
    ]
}

For a record in the list to be valid, it must contain at least one of the 4 identifiers. It is strongly recommended that you do not submit records that contain only a title.

You may supply a maximum of 3000 identifiers per job (assuming your account limit allows it).
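
The rules above can be checked on the client before anything is sent. The following Python sketch builds the POST request without sending it. It assumes that the 3000 limit counts records in the list, and the function name build_job_request is illustrative rather than part of the API.

```python
import json
import urllib.request
from urllib.parse import urlencode

BASE_URL = "https://api.cottagelabs.com/service/lantern"
ID_FIELDS = ("Article title", "DOI", "PMCID", "PMID")
MAX_PER_JOB = 3000  # assumed to count records; your account limit may be lower


def build_job_request(api_key, email, filename, records):
    """Validate the records and return an unsent POST request for a new job."""
    if len(records) > MAX_PER_JOB:
        raise ValueError(f"{len(records)} records exceeds the limit of {MAX_PER_JOB}")
    for index, record in enumerate(records):
        # A record is valid if at least one identifier field is non-empty.
        if not any(str(record.get(field) or "").strip() for field in ID_FIELDS):
            raise ValueError(f"record {index} contains none of {ID_FIELDS}")
    body = json.dumps({"email": email, "filename": filename, "list": records})
    return urllib.request.Request(
        f"{BASE_URL}/?{urlencode({'apikey': api_key})}",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json; charset=UTF-8"},
        method="POST",
    )
```

Passing the returned object to urllib.request.urlopen would submit the job.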

In response you will receive something of the following form:

{
    "data": {
        "max": 3000,
        "job": "<your job id - you will need this in other API requests>",
        "length": 20,
        "quota": {
            "count": 20,
            "available": 99980,
            "premium": false,
            "additional": 0,
            "admin": false,
            "max": 100000,
            "display": false,
            "until": false,
            "allowed": true,
            "email": "<your email address>"
        }
    }
}

The response always includes your account quota, for reference if you are going to make further requests. See the section Check your account quota for details on the meaning of the values.
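
As a small illustration, the following Python sketch pulls the job id and the remaining quota out of a response of this form. The function name is illustrative, and reading "length" as the number of records accepted into the job is an assumption based on the example values.

```python
import json


def summarise_create_response(text):
    """Extract the job id and quota figures from a job creation response."""
    data = json.loads(text)["data"]
    return {
        "job": data["job"],                         # needed for later requests
        "submitted": data["length"],                # assumed: records in this job
        "remaining": data["quota"]["available"],    # identifiers left this month
    }
```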

Get information about a current job

This allows you to retrieve the list of identifiers associated with a given job.

GET /:job?apikey=<api key>

This will return something of the following form:

{
    "data": {
        "list": [
            {
                "process": "<per-identifier process id>",
                "PMID": "<pubmed id>",
                "DOI": "<doi>",
                "pmcid": "<EuropePMC id>",
                "title": "<Article title>"
            },
            ... all identifiers in this job ...
        ]
    }
}

The "per-identifier process id" is informational only; you cannot use it in other requests.

Get a progress report of the current job

This allows you to track the ongoing progress of your job.

GET /:job/progress?apikey=<api key>

You will get back a JSON document of the following structure:

{
    "data": {
        "progress": 0,
        "_id": "<your job id>",
        "email": "<your account email address>"
    }
}

The progress field is a floating-point number between 0 and 100, giving the percentage of the job that is complete.
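
A common way to use this endpoint is to poll it until the job is complete. The following Python sketch shows one way to do that. The function names, the 30 second interval and the limit of 120 polls are all assumptions chosen for illustration; the documentation does not specify a recommended polling rate.

```python
import json
import time
import urllib.request
from urllib.parse import quote, urlencode

BASE_URL = "https://api.cottagelabs.com/service/lantern"


def fetch_progress(job_id, api_key):
    """GET /:job/progress and return the parsed JSON document."""
    query = urlencode({"apikey": api_key})
    url = f"{BASE_URL}/{quote(job_id, safe='')}/progress?{query}"
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def wait_until_complete(fetch, poll_seconds=30.0, max_polls=120, sleep=time.sleep):
    """Poll until progress reaches 100; return False if max_polls is exhausted."""
    for attempt in range(max_polls):
        progress = float(fetch()["data"]["progress"])
        if progress >= 100:
            return True
        if attempt < max_polls - 1:
            sleep(poll_seconds)  # wait before asking again
    return False
```

It could be called as wait_until_complete(lambda: fetch_progress("<your job id>", "<api key>")).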

Get a list of identifiers which still remain to be processed

This will give you back a subset of the identifiers in your original job that still remain to be processed.

GET /:job/todo?apikey=<api key>

You will get back a JSON document of the following structure:

{
    "data": [
        {
            "process": "<per-identifier process id>",
            "PMID": "<pubmed id>",
            "DOI": "<doi>",
            "pmcid": "<EuropePMC id>",
            "title": "<Article title>"
        },
        ... all unprocessed identifiers in this job ...
    ]
}

Get a list of the processed records and their results

When you are ready to download your full results (when the progress endpoint indicates 100%), or your partial results (at any time while the job is being processed), you can request them from this endpoint:

GET /:job/results?apikey=<api key>&format=<json|csv>

If you request format=json (the default, if you omit it) you will get back a JSON document of the following structure:

{
    "data": [
        {
            "_id": "<your job id>",
            "process": "<your job id>",
            "createdAt": <timestamp of job creation time>,

            "doi": "<doi>",
            "DOI": "<doi>",
            "pmcid": "<EuropePMC id>",
            "pmid": "<pubmed id>",
            "PMID": "<pubmed id>",
            "title": "<Article title>",

            "publisher": "<Publisher name>",
            "author": [
                {
                    "fullName": "<Author's full name>",
                    "firstName": "<Author's first name>",
                    "lastName": "<Author's last name>",
                    "initials": "<Author's initials>",
                    "affiliation": "<Author's affiliation, as a descriptive string>",
                    "authorId": {
                        "type": "ORCID (or other author identifier)",
                        "value": "<Author's identifier>"
                    }
                }
            ],
            "electronicPublicationDate": "<Electronic publication date in the form YYYY-MM-DD>",

            "journal": {
                "in_doaj": true|false,
                "title": "<Journal title>",
                "issn": "<Journal ISSN (may be print or electronic)>",
                "eissn": "<Journal E-ISSN>",
                "dateOfPublication": "<Publication date in the form YYYY-MM-DD>"
            },

            "in_epmc": true|false,
            "is_aam": true|false,
            "is_oa": true|false,
            "aheadofprint": true|false,
            "has_fulltext_xml": true|false,

            "licence": "<Licence String (e.g. CC-BY)>",
            "licence_source": "epmc_html|other code for where the licence was found",
            "epmc_licence": "<Licence String (e.g. CC-BY)>",
            "epmc_licence_source": "epmc_html|other code for where the licence was found",
            "publisher_licence_check_ran": true,
            "publisher_licence": "unknown",

            "romeo_colour": "green|blue|yellow|white",
            "embargo": {
                "preprint": <number of months>,
                "postprint": <number of months>,
                "pdf": <number of months>
            },
            "archiving": {
                "preprint": "can|cannot|unknown|other status from romeo",
                "postprint": "can|cannot|unknown|other status from romeo",
                "pdf": "can|cannot|unknown|other status from romeo"
            },

            "repositories": [
                {
                    "name": "<Repository Name>",
                    "url": "<repository URL>",
                    "fulltexts": [
                        "<URLs for repository items>"
                    ]
                }
            ],

            "grants": [
                {
                    "grantId": "<Grant Number>",
                    "agency": "<Funder Name>",
                    "orderIn": 0,
                    "PI": "<Principal Investigator>",
                    "acronym": "<Funder Acronym>"
                }
            ],

            "confidence": <number between 0 and 1 for our confidence we have identified the right article from the identifiers supplied>,
            "provenance": [
                "<Human-readable data provenance notes, generated during processing>",
                ...
            ]
        }
    ]
}

If you request format=csv you will get back a CSV file, formatted as described in the user documentation.
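
Once the JSON results are loaded, the fields described above can be used to pick out records that deserve a closer look. The following Python sketch flags records whose confidence is low or which are not marked as open access. The 0.8 threshold and the choice of these two checks are assumptions for illustration, not a definition of compliance.

```python
def records_to_review(results, min_confidence=0.8):
    """Return an identifier for each record that may need manual checking."""
    flagged = []
    for record in results["data"]:
        confidence = record.get("confidence") or 0
        if confidence < min_confidence or not record.get("is_oa"):
            # Prefer the DOI as a label, falling back to other identifiers.
            label = (record.get("doi") or record.get("pmcid")
                     or record.get("pmid") or record.get("title"))
            flagged.append(label)
    return flagged
```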

List your current and previous jobs

If you want to see a history of the jobs you've created in the system, you can use the following:

GET /jobs/:email?apikey=<api key>

(don't forget to URL-encode your email address)

You'll get a response of the following form:

{
    "data": {
        "total": 14,
        "jobs": [
            {
                "_id": "<job id>",
                "email": "<your email>",
                "createdAt": <timestamp of created date of job>,
                "processes": <number of identifiers in job>,
                "refresh": 1,
                "done": true|false
            },
            ... all your jobs ...
        ]
    }
}
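
Because the email address forms part of the URL path, characters such as @ and + need to be percent-encoded. The following Python sketch shows one way to do this with the standard library; the function name jobs_url is illustrative only.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.cottagelabs.com/service/lantern"


def jobs_url(email, api_key):
    """Build the /jobs/:email URL with the email address percent-encoded."""
    encoded_email = quote(email, safe="")  # safe="" also encodes "/" and "@"
    return f"{BASE_URL}/jobs/{encoded_email}?{urlencode({'apikey': api_key})}"
```

The same encoding applies to the email address in the /quota/:email endpoint.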

Check your account quota

All accounts have quotas depending on their privilege level. You can use this endpoint to check your total monthly quota, and how much of it you've currently used.

GET /quota/:email?apikey=<api key>

(don't forget to URL-encode your email address)

You'll get a response of the following form:

{
    "data": {
        "email": "<your email>",

        "max": 5000,
        "additional": 0,
        "until": "<expiry date of additional allowance>",
        "count": 280,
        "available": 4720,

        "premium": true|false,
        "admin": true|false,
        "display": true|false,
        "allowed": true|false
    }
}

This example shows a Premium Account (with a maximum allowance of 5000 identifiers per month) which has run 280 identifiers this month, leaving 4720 remaining.

The fields have the following meanings:

  • max - Monthly allowance
  • additional - additional, temporary monthly allowance
  • until - expiry date of additional allowance
  • count - number of identifiers processed so far this month
  • available - number of identifiers remaining this month
  • premium - whether this is a premium account
  • admin - whether this is an admin account (unless you work for us, it probably won't be!)
  • display - whether this account is discoverable
  • allowed - whether this account is currently active
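
These fields can be used to decide whether a job is worth submitting. The following Python sketch checks a planned job against the quota document and the 3000 per-job limit. The function name is illustrative, and it relies only on the available value as returned, since how additional contributes to available is not stated here.

```python
def can_submit(quota, record_count, per_job_max=3000):
    """Return True if a job of record_count identifiers fits the quota."""
    return (
        bool(quota["allowed"])                    # account is currently active
        and record_count <= per_job_max           # per-job limit
        and record_count <= quota["available"]    # remaining monthly allowance
    )


# Values from the example response above: 5000 - 280 = 4720 remaining.
example = {"max": 5000, "additional": 0, "count": 280,
           "available": 4720, "allowed": True}
```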