Web Services (APIs)

Open Context has a powerful Application Program Interface (API) that enables you to programmatically:

  • Search and browse data
  • Visualize data
  • Analyze data
  • Link to Open Context records (entity reconciliation)
  • Interoperate with Open Context

The discussions below provide background information on the APIs. For specific examples of the APIs in use, check out the Open Context API Cookbook.

API Introduction

Open Context has two primary APIs. One returns data on individual records, and the other returns data from searches. Both APIs share data in the JSON-LD format. JSON-LD means you can treat the information as simple JSON data, or you can parse it as RDF for Linked Data applications. Open Context supplements these primary JSON APIs with secondary APIs (see below) oriented around XML data.

The general goal of the search/query/faceted-browse API is to provide clients with links described with useful information to:

  • Change state (further filter Open Context)
  • Get useful numeric summaries of filtered sets of Open Context records

This is all built on a model of faceted search / faceted browsing. In faceted search, the service returns information that summarizes a collection according to different metadata facets. This is the main way data in Open Context can be understood in aggregate.

Open Context's approach of exposing metadata as links that you can follow provides a uniform interface for otherwise very diverse data. This common method allows a client or user to discover metadata links common across multiple projects or links to attributes specific to a single project. The aim is to simplify accessing, querying, and analyzing a wide variety of data.


Overview of Search and Query Services

Open Context has a variety of query services for different types of content. These all use the same query syntax and, if desired, will return JSON using the same schema. The different query services are as follows:

Other Web Services and APIs

Open Context provides an Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH) endpoint. Many digital library systems, repositories and aggregators of scholarly content support this type of service.

Open Context's OAI-PMH service endpoint is located here: https://staging.opencontext.org/oai/

Open Context has paused providing a feed using the Atom Syndication Format. This paged feed provides a list (manifest) of URI-identified content published by Open Context and metadata describing this content. Once reactivated, this feed will list these resources with the most recently updated records first. Digital libraries and other systems can crawl this feed to accession content from Open Context.

Open Context's manifest feed WILL BE located here: https://staging.opencontext.org/manifest/.atom


Caveats

Currently, Open Context supports "read-only" APIs and does not allow methods for posting or otherwise modifying data. Also, you are currently using a mirror instance of Open Context. The JSON data will contain URIs to content at http://opencontext.org/, where the canonical version of Open Context is running. To dereference these identifiers using this current host, replace http://opencontext.org/ with the current host, https://staging.opencontext.org/.
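The host substitution described above can be sketched in a few lines of Python. This is only an illustration: the function name to_current_host and the two constants are hypothetical names, and the sketch assumes that a simple prefix swap is all that is needed.

```python
# Hypothetical helper: swap the canonical host prefix for the current mirror host.
CANONICAL_HOST = "http://opencontext.org/"
CURRENT_HOST = "https://staging.opencontext.org/"


def to_current_host(uri, current_host=CURRENT_HOST):
    """Return uri with the canonical host prefix replaced by current_host.

    URIs that do not start with the canonical host (for example, URIs of
    external vocabularies) are returned unchanged.
    """
    if uri.startswith(CANONICAL_HOST):
        return current_host + uri[len(CANONICAL_HOST):]
    return uri


print(to_current_host(
    "http://opencontext.org/subjects/72b1adc2-d085-4425-a402-16eab349d8f6"
))
```

Only URIs that begin with the canonical host are rewritten, so links to outside vocabularies pass through untouched.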

Icon Credits
Gears icon by Jeremy Minnick via the NounProject.com

Getting JSON(-LD) Data

Open Context offers two main ways to get JSON-LD data. The better-practice approach relies upon HTTP content negotiation. The second, more expedient approach involves adding a .json extension to request URLs. The table below provides examples of these approaches.

Approach Description and Examples
Content Negotiation

Open Context supports content negotiation for both the item record and search APIs. To request JSON data, include any one of the following in the HTTP header of your request:

  • Accept: application/json (Usually also valid as JSON-LD)
  • Accept: application/vnd.geo+json (Usually also valid as JSON-LD)
  • Accept: application/ld+json (Strict JSON-LD)
Modifying Item Record URLs

For individual record items, you can get JSON data by simply adding a ".json" to the end of a record's URL. For example, you can get the JSON for this object: https://staging.opencontext.org/subjects/c4f88b9c-aee0-430a-baad-083f5dfda8fd, with this URL: https://staging.opencontext.org/subjects/c4f88b9c-aee0-430a-baad-083f5dfda8fd.json.

Because JSON-LD has some constraints that make it unsuitable to express complex geospatial features (see note below), Open Context allows you to specifically request strict JSON-LD either through content negotiation or by appending .jsonld to item URLs.

Modifying Query URLs

Similarly, adding a ".json" to search URLs also requests JSON data. To do so, add the ".json" just before the question mark ("?") in a search URL. If the URL doesn't have a question mark, append the ".json" at the end of the URL.

  1. Query without a "?" in the URL

    The search URL https://staging.opencontext.org/query/Asia/Turkey/Kenan+Tepe becomes https://staging.opencontext.org/query/Asia/Turkey/Kenan+Tepe.json for a JSON request.

  2. Query with a "?" in the URL

    The search URL https://staging.opencontext.org/query/?proj=24-murlo becomes https://staging.opencontext.org/query/.json?proj=24-murlo for a JSON request.
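The two cases above reduce to one rule: append ".json" to the path portion of the URL and leave any query string alone. A minimal Python sketch of that rule follows; the function name as_json_url is a hypothetical name chosen for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def as_json_url(url):
    """Append ".json" to the path of an Open Context URL (hypothetical helper).

    The query string, if any, is preserved, so the ".json" lands just
    before the "?" as described above.
    """
    parts = urlsplit(url)
    if parts.path.endswith(".json"):
        return url  # already a JSON request URL
    return urlunsplit(parts._replace(path=parts.path + ".json"))


print(as_json_url("https://staging.opencontext.org/query/Asia/Turkey/Kenan+Tepe"))
print(as_json_url("https://staging.opencontext.org/query/?proj=24-murlo"))
```

The same helper also covers item record URLs, since those simply take the ".json" at the end.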

Modifying Query URLs with JSONP

Finally, while Open Context supports CORS (Cross Origin Resource Sharing), in practice some clients may still have trouble accessing JSON data. For these clients, Open Context APIs support JSONP (JSON with "padding") for URLs with a ".json" added as described above. To request JSONP, add a "callback" parameter to your request, as in the examples below:

  1. Example (Search/Query Request):

    The plain JSON(-LD) request URL https://staging.opencontext.org/query/Asia/Turkey/Kenan+Tepe.json can be made as a JSONP request like: https://staging.opencontext.org/query/Asia/Turkey/Kenan+Tepe.json?callback=myFunction

  2. Example (Item Request):

    The plain JSON(-LD) request URL https://staging.opencontext.org/projects/3585b372-8d2d-436c-9a4c-b5c10fce3ccd.json can be made as a JSONP request like: https://staging.opencontext.org/projects/3585b372-8d2d-436c-9a4c-b5c10fce3ccd.json?callback=oneDoesNotSimply

Other Linked Data Representations

Open Context provides preliminary / experimental support for other RDF (linked data) representations via content negotiation or by appending extensions to the URLs of item records. These now include:

  • Accept: application/n-triples, or extension: .nt
  • Accept: application/rdf+xml, or extension: .rdf
  • Accept: text/turtle, or extension: .ttl

Permissions, Request Headers, and Example

Web bots (crawlers, spiders) constantly make requests to Open Context. Some of these bots serve malicious purposes. To avoid getting overwhelmed by bots, Open Context has fairly strict controls to intercept suspicious requests.

Open Context will often block API requests unless you specifically add an allowed User-Agent to the HTTP header of your request. A good user agent would be: 'User-Agent': 'oc-api-client'. Some example Python code that includes this User-Agent in making requests to Open Context's APIs can be found in this repository: https://github.com/ekansa/open-context-jupyter/blob/release/opencontext/api.py.
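As a rough sketch of what such a request could look like using only the Python standard library, the example below sets both the User-Agent and an Accept header for content negotiation. The function names build_request and fetch_json are hypothetical, and this is not the code from the repository linked above.

```python
import json
from urllib.request import Request, urlopen


def build_request(url, accept="application/json", user_agent="oc-api-client"):
    """Build a request carrying the User-Agent and Accept headers."""
    return Request(url, headers={"Accept": accept, "User-Agent": user_agent})


def fetch_json(url, accept="application/json", timeout=30):
    """Fetch a URL and parse the response body as JSON.

    Performs a network call; errors such as a blocked request surface
    as urllib.error.HTTPError.
    """
    with urlopen(build_request(url, accept=accept), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


req = build_request(
    "https://staging.opencontext.org/query/Asia/Turkey/Kenan+Tepe",
    accept="application/ld+json",
)
print(req.header_items())
```

Keeping the request construction separate from the network call makes it easy to check the headers without contacting the server.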

GeoJSON-LD: Geospatial Data with Time

Most of the information returned by Open Context's APIs, except for data about concepts in controlled vocabularies ("types") and descriptive or linking relations ("predicates"), contains geospatial data. To make the geospatial information easier to use, Open Context expresses geospatial data using the GeoJSON standard. GeoJSON is widely supported by Web mapping services, visualization tools, and desktop GIS software.

Open Context has adopted GeoJSON-LD conventions documented here. One component added to the GeoJSON-LD features is a "when" object (see current proposals and earlier discussion). The when object adds information on the chronological coverage of a GeoJSON feature. Time intervals are expressed as ISO 8601 string values in the start and stop limits. Dates BCE are indicated as negative values (with 1 BCE noted as "0000", hence values that look somewhat odd). Importantly, each GeoJSON feature with a when object should be considered an "event" that has a spatial as well as a chronological coverage. Below is an example of a when object that provides chronological coverage information:

    "when": {
        "id": "#event-when-2",
        "type": "Interval",
        "start": "-8049",
        "stop": "-6049",
        "reference_type": "specified"
    }
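Because 1 BCE is written as "0000", a start value such as "-8049" reads as 8050 BCE rather than 8049 BCE. The sketch below shows one way to turn these values into readable labels. It assumes the start and stop strings contain only a signed year, as in the example above, and the function names are hypothetical.

```python
def describe_year(iso_year):
    """Convert a signed ISO 8601 year string into a BCE/CE label.

    Assumes year-only values and the convention that "0000" is 1 BCE.
    """
    year = int(iso_year)
    if year <= 0:
        return f"{1 - year} BCE"
    return f"{year} CE"


def describe_when(when):
    """Summarize the interval of a "when" object as readable text."""
    return f"{describe_year(when['start'])} to {describe_year(when['stop'])}"


example = {
    "id": "#event-when-2",
    "type": "Interval",
    "start": "-8049",
    "stop": "-6049",
    "reference_type": "specified",
}
print(describe_when(example))  # 8050 BCE to 6050 BCE
```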

Open Context provides three main varieties of GeoJSON-LD data described with examples in the table below.

Variant Description and Examples
Item Record GeoJSON-LD

In Open Context, some items have multiple GeoJSON-LD events. For example, an archaeological site can have multiple episodes of occupation. So a single Open Context record can have multiple GeoJSON-LD features (see example: HTML, GeoJSON-LD).

Query Result Record GeoJSON-LD

The Open Context Query API can return GeoJSON-LD features that represent a search result. To simplify matters, only one feature is returned per search result. A "when" object in a feature reflects the item's total time coverage unless the query specified limits on time. If the query limits by time spans, the when object provides chronology information within the constraints of the query.

By default, the Query API returns GeoJSON-LD features of search results. Search result records come as point features. Two attributes in the JSON data differentiate result record features from facet-region features (see below). These are:

  1. Features with an attribute: "category": "oc-api:geo-record"
  2. Feature properties with an attribute: "feature-type": "item record"

To limit the GeoJSON-LD response to result records, add a response=geo-record parameter to the query URL (see example).

Open Context returns a few common attributes for result records in the GeoJSON "properties" object. Open Context will add additional attributes to the properties object depending on query filters in the request. For instance, the following search requests animal bones from Ecuador that have a biological taxonomy identification: https://staging.opencontext.org/query/Americas/Ecuador?type=subjects&cat=oc-gen-cat-bio-subj-ecofact---oc-gen-cat-animal-bone&prop=obo-foodon-00001303. GeoJSON result records from that query will have a taxonomic attribute in their properties as shown below:

    "properties": {
        "id": "#rec-1-of-244200",
        "feature-type": "item record",
        "uri": "http://opencontext.org/subjects/72b1adc2-d085-4425-a402-16eab349d8f6",
        ... more attributes ...
        "Has taxonomic identifier": ["Lycalopex"]
    }

You can request more attributes to include in the properties of a GeoJSON result record feature. To do so, add the attributes parameter with a comma separated list of descriptive predicates, identified by slug (slugs are included in JSON-LD and are easy to read short-hand identifiers specific to Open Context) or by URI.

  1. Animal bones in Ecuador with anatomical identification attribute data: attributes=oc-zoo-has-anat-id
  2. Pottery from the Murlo project in Italy with vessel-form attribute data: attributes=24-vessel-form

If you don't know the slugs for attributes you want describing your GeoJSON records, you can request ALL-PROJECT (for all project defined attributes available for an item), ALL-STANDARD-LD (for all "standard" attributes that may be used in multiple projects), or ALL-ATTRIBUTES (for both all project defined and all "standard" attributes).

  1. Pottery from the Murlo project in Italy with all project defined attribute data: ALL-PROJECT
  2. Pottery from the Murlo project in Italy with all standard attributes used across multiple projects: ALL-STANDARD-LD
  3. Pottery from the Murlo project in Italy with all available attributes (both project defined and standard): ALL-ATTRIBUTES

Finally, some clients (especially desktop GIS applications) handle multiple values for property attributes poorly. To work around this problem, you can add the parameter flatten-attributes=1 to the URL requesting GeoJSON records.
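The parameters discussed in this section can be combined on one request URL. The following sketch assembles such a URL. The function name geo_record_url is hypothetical, and combining response=geo-record with attributes and flatten-attributes in a single request is an assumption made for illustration.

```python
from urllib.parse import urlencode


def geo_record_url(base_json_url, attributes, flatten=False):
    """Add GeoJSON record parameters to an existing ".json" query URL.

    attributes is a list of predicate slugs, or one of the keywords
    ALL-PROJECT, ALL-STANDARD-LD, ALL-ATTRIBUTES.
    """
    params = {"response": "geo-record", "attributes": ",".join(attributes)}
    if flatten:
        params["flatten-attributes"] = "1"
    separator = "&" if "?" in base_json_url else "?"
    # safe="," keeps the comma-separated list readable in the URL
    return base_json_url + separator + urlencode(params, safe=",")


print(geo_record_url(
    "https://staging.opencontext.org/query/.json?proj=24-murlo",
    ["24-vessel-form"],
    flatten=True,
))
```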

Query Region Facet GeoJSON-LD

Open Context uses a tiling algorithm to hierarchically index geospatial data. This approach allows Open Context to aggregate data at different scales by location. Open Context returns aggregated data as "region facets" that can be visualized in a map interface as square polygons. By default the query API returns region facets. Two attributes in the JSON data differentiate result record features (see above) from facet-region features. These are:

  1. Features with an attribute: "category": "oc-api:geo-facet"
  2. Feature properties with an attribute: "feature-type": "discovery region (facet)"

To limit the GeoJSON-LD response to facet regions, add a response=geo-facet parameter to the query URL (see example).
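When a response contains both kinds of features, a client can separate them using the "category" attribute described above. The sketch below assumes the response has already been parsed into a Python dictionary with a standard GeoJSON "features" list. The function name split_features is hypothetical.

```python
def split_features(feature_collection):
    """Split GeoJSON features into (result records, region facets).

    Relies on the "category" attribute of each feature; features with
    any other category are ignored.
    """
    records, facets = [], []
    for feature in feature_collection.get("features", []):
        category = feature.get("category")
        if category == "oc-api:geo-record":
            records.append(feature)
        elif category == "oc-api:geo-facet":
            facets.append(feature)
    return records, facets


# Tiny hand-made stand-in for a parsed API response
sample = {
    "type": "FeatureCollection",
    "features": [
        {"category": "oc-api:geo-facet",
         "properties": {"feature-type": "discovery region (facet)"}},
        {"category": "oc-api:geo-record",
         "properties": {"feature-type": "item record"}},
    ],
}
records, facets = split_features(sample)
print(len(records), len(facets))  # 1 1
```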

Last, you can change the level of aggregation in the facet regions with the geodeep parameter. The maximum value is 20, corresponding to a Web mapping zoom level of 20. Below are two examples with different levels of aggregation:

  1. More aggregated (coarse) data at geodeep=7, presented in: HTML (with map visualization) or GeoJSON-LD
  2. Less aggregated (fine-grain) data at geodeep=11, presented in: HTML (with map visualization) or GeoJSON-LD

Query API: General Syntax Notes

The Open Context API returns URLs with some descriptive information on how to query and filter. URLs in the id and json keys represent requests for different filters: id links return HTML (unless you request otherwise through HTTP content negotiation), and json links return JSON.

To make it easier to debug and understand, Open Context uses "slugs" to identify different predicates (descriptive properties and linking relations) and objects (mainly controlled vocabulary concepts) in a query. The slug keys in the Item Record and the Query JSON-LD APIs provide slugs that you can use in a query.

In the Open Context search API, the main pattern for composing queries works as follows:

  1. A descriptive property predicate by itself filters for all records that have that descriptive property. In the example below, the search returns all records described by the predicate "Has taxonomic identifier" (with that predicate identified by a slug):

    prop=obo-foodon-00001303

  2. Instead of using slugs you can also use URL-escaped URIs to identify concepts to use in a query. The query API provides URIs that identify concepts with the rdfs:isDefinedBy key (Caveat: concepts in the oc-api and oc-gen namespaces won't resolve yet, so don't reference them in this way for the time being). Here is the same query as above using a URI rather than a slug to identify the predicate "Has taxonomic identifier".

    prop=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FFOODON_00001303

  3. To search for a specific value used with a given descriptive property, append a --- delimiter after the slug for the descriptive property and follow with the slug (or URI) identifier for that value. In the example below, the search returns all records described by the predicate "Has taxonomic identifier" and values identified with gbif-359 for "Mammalia":

    prop=obo-foodon-00001303---gbif-359

  4. In the example above, Open Context has the object value gbif-359 for "Mammalia" in a hierarchy of classifications. It automatically returns results that include mammals and all subordinate (more specific) taxa. You can further restrict the scope of your query to more specific taxa by appending a --- delimiter after the gbif-359 and then adding a slug for a more specific type of mammal. In the example below, the search returns all records described by the predicate "Has taxonomic identifier" and values identified with gbif-359 for "Mammalia" and "Carnivora" (gbif-732):

    prop=obo-foodon-00001303---gbif-359---gbif-732

  5. You don't need to know the position of a concept in a hierarchy in order to search for it. The following query returns identical results for Carnivores as the one above, but omits reference to Mammalia:

    prop=obo-foodon-00001303---gbif-732

  6. Open Context uses double pipe characters ("||") for Boolean "OR" terms. The following example returns results described as either carnivore gbif-732 or lagomorph (rabbits, hares) gbif-785:

    prop=obo-foodon-00001303---gbif-732||gbif-785
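The composition rules above ("---" between hierarchy steps, "||" between alternatives, URL-escaped URIs in place of slugs) can be captured in a small helper. This is a sketch with a hypothetical function name, prop_filter. It leaves the "||" and "---" delimiters unescaped, as they appear in the examples above.

```python
from urllib.parse import quote


def prop_filter(*path):
    """Compose a prop= filter from slugs or URIs.

    Each positional argument is one step in the path: either a single
    slug/URI string, or a list of alternatives joined with "||" (OR).
    """
    steps = []
    for step in path:
        options = [step] if isinstance(step, str) else list(step)
        # quote() with safe="" escapes ":" and "/" in URIs; slugs pass through
        steps.append("||".join(quote(option, safe="") for option in options))
    return "prop=" + "---".join(steps)


print(prop_filter("obo-foodon-00001303", "gbif-359", "gbif-732"))
print(prop_filter("obo-foodon-00001303", ["gbif-732", "gbif-785"]))
print(prop_filter("http://purl.obolibrary.org/obo/FOODON_00001303"))
```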


Specialized Queries

In addition to the general querying patterns discussed above, Open Context has some more specialized query options, as described below:

Parameter Definition and Examples
q={URL encoded search term}

Adding this query parameter with a URL encoded search term requests a full-text search. This works similarly to familiar text searches elsewhere on the Web, and can be combined with other search parameters. Here is an example text search for the term "bucchero".

obj={URL encoded URI}

Adding this query parameter with a URL encoded URI requests records that link (via any attribute or relation predicate) to a URI-identified entity defined by a data source, vocabulary or ontology outside of Open Context.

For example, this search, https://staging.opencontext.org/query/?obj=http%3A%2F%2Fpurl.obolibrary.org%2Fobo%2FUBERON_0000979, finds all records associated with the UBERON concept for the bone element tibia, identified by this URI: http://purl.obolibrary.org/obo/UBERON_0000979. This example works similarly to a SPARQL query formulated like:

SELECT ?s
{
?s ?p <http://purl.obolibrary.org/obo/UBERON_0000979>
}
bbox

This query parameter requests a filter defined by a geospatial bounding box for the location of discovery or description. The bounding box is expressed as comma-separated coordinate values in WGS84 decimal degrees (typical for Web mapping). The order of coordinates follows the GeoJSON pattern of (x,y) (Longitude, Latitude). The first pair of coordinates defines the lower-left (south-west) corner of the bounding box, and the second pair defines the top-right (north-east) corner. An invalid bounding box will be ignored as a search filter but will return a notice in the oc-api:active-filters of the JSON-LD response.

Bounding Box search examples include:

  1. Filtering for a region in the south-east of the United States: -87.19 Longitude, 21.29 Latitude (south-west corner) to -73.48 Longitude, 34.60 Latitude (north-east corner)
  2. You can use the double pipe characters ("||") for Boolean "OR" searches of multiple regions, as in this example: 31.37,34.31,35.42,35.96 (roughly Cyprus) OR 31.86,35.89,34.53,36.97 (roughly Cilicia in Turkey)
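Since an invalid bounding box is silently ignored as a filter, it can help to validate coordinates on the client before sending a request. The sketch below builds the bbox value in (Longitude, Latitude) order and joins multiple boxes with "||". The function names are hypothetical, and the range checks reflect ordinary WGS84 limits rather than any documented server-side rule.

```python
def bbox_value(west, south, east, north):
    """Format one bounding box as "west,south,east,north" (x,y order)."""
    if not (-180 <= west <= 180 and -180 <= east <= 180):
        raise ValueError("longitudes must be within -180 and 180")
    if not (-90 <= south <= north <= 90):
        raise ValueError("latitudes must satisfy -90 <= south <= north <= 90")
    return ",".join(str(value) for value in (west, south, east, north))


def bbox_param(*boxes):
    """Build a bbox= parameter; several boxes are combined as Boolean OR."""
    return "bbox=" + "||".join(bbox_value(*box) for box in boxes)


# South-east United States
print(bbox_param((-87.19, 21.29, -73.48, 34.6)))
# Roughly Cyprus OR roughly Cilicia
print(bbox_param((31.37, 34.31, 35.42, 35.96), (31.86, 35.89, 34.53, 36.97)))
```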
allevent-start={Integer year BCE/CE}
allevent-stop={Integer year BCE/CE}

These parameters filter a request by the general date ranges of the formation, use, and/or life of items described in the search results. The allevent-start parameter defines the earliest date to include in search results, while the allevent-stop parameter defines the latest date to include in search results. With these parameters, use integer years: negative values for dates BCE and positive values for dates CE.

Some items, such as archaeological sites, may have multiple date ranges (a site may have multiple episodes of occupation and use). In these cases, Open Context's query results will include records that have date ranges falling within the allevent-start and allevent-stop limits, even if the records also have date ranges that fall outside of these limits.

Date range search examples include:

  1. Filtering for sites in the United States that have occupation / use components that date before 7500 BCE: allevent-start=-7500
  2. Filtering for cattle bones dating between 12000 BCE and 4000 BCE: allevent-start=-12000&allevent-stop=-4000
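A small helper can assemble these two parameters and catch a reversed range before the request is sent. The function name date_range_params is hypothetical, and the check that start does not exceed stop is a client-side convenience, not a documented API rule.

```python
from urllib.parse import urlencode


def date_range_params(start=None, stop=None):
    """Build allevent-start / allevent-stop parameters from integer years.

    Negative years are BCE, positive years are CE. Either limit may be
    omitted.
    """
    if start is not None and stop is not None and start > stop:
        raise ValueError("start year must not be later than stop year")
    params = {}
    if start is not None:
        params["allevent-start"] = int(start)
    if stop is not None:
        params["allevent-stop"] = int(stop)
    return urlencode(params)


print(date_range_params(start=-7500))
print(date_range_params(start=-12000, stop=-4000))
```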
linked=dinaa-cross-ref

Adding this query parameter and value adds a special filter for interacting with data in the Digital Index of North American Archaeology (DINAA) project. The query filters for DINAA records cross-referenced with URI identified resources curated by other online collections. Click here to view cross-referenced records in DINAA.

id={Identifier string}

Adding this query parameter and value filters records for different types of identifiers, including DOIs, ARKs, ORCIDs, and Open Context URIs and UUIDs. Open Context will evaluate a query against different variants of an identifier. For example, the identifier "doi:10.6078/M77P8W98" can be expressed as:

A search for any one of the above ID variants will retrieve the same record.

Query API: Response Options, Metadata, and Paging

By default the Open Context API returns a wide variety of information in response to a query request. The responses include metadata, a variety of search facets, geospatial data (in the form of regional facets and result records) and some non-geospatial results. You can limit the types of responses you get from Open Context searches by adding the response parameter and a comma-separated list of response types to a request URL. For example, the https://staging.opencontext.org/query/Americas/United+States.json?response=metadata request only gets metadata about a search result, while https://staging.opencontext.org/query/Americas/United+States.json?response=metadata,uri gets metadata and URIs of result records. Allowed response types are:

  • metadata: metadata about a search, includes paging information, number of records found, etc.
  • facet: search facets that ARE NOT GeoJSON region facets (see the GeoJSON discussion) and ARE NOT chronological span facets
  • chrono-facet: search facets for chronological spans (see the Query Facets discussion)
  • geo-facet: search facets that ARE GeoJSON region facets (see the GeoJSON discussion)
  • geo-record: result records as GeoJSON features (see the GeoJSON discussion)
  • uri-meta: result records as a simple list of URIs and some attributes
  • uri: result records as a simple list of URIs
  • uuid: result records as a simple list of UUIDs
  • solr: raw and unprocessed Solr query response (these will be very difficult to process given the high-level of abstraction in Open Context's Solr schema)
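The list above lends itself to a simple client-side check, so that a typo in a response type is caught before a request is made. In this sketch the function name response_param and the constant ALLOWED_RESPONSE_TYPES are hypothetical. The values themselves come from the list above.

```python
ALLOWED_RESPONSE_TYPES = (
    "metadata", "facet", "chrono-facet", "geo-facet", "geo-record",
    "uri-meta", "uri", "uuid", "solr",
)


def response_param(*types):
    """Build a response= parameter from one or more allowed response types."""
    unknown = [t for t in types if t not in ALLOWED_RESPONSE_TYPES]
    if unknown:
        raise ValueError(f"unknown response type(s): {', '.join(unknown)}")
    return "response=" + ",".join(types)


base = "https://staging.opencontext.org/query/Americas/United+States.json"
print(base + "?" + response_param("metadata", "uri"))
```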

The table below introduces some of the general metadata and linking options (especially paging) provided by the Query API:

Attribute Definition and Examples
totalResults

Total number of results found in the search.

start
stop
dc-terms:temporal

The Query API returns summary time span information for the set of results obtained from a query. Time intervals describe general date ranges relevant to the formation, use, and/or life of items described in the search results. The API expresses the date information as ISO 8601 string values in the start and stop limits (the same kind of expression used in GeoJSON "when" objects). The dc-terms:temporal key provides the same time-span information in a manner consistent with the patterns of the Pelagios project (see documentation here).

Paging

The Query API returns links to page through the list of query result records. Paging links are provided for both HTML and JSON versions as follows:

  1. first, first page
  2. previous, previous page
  3. next, next page
  4. last, last page

In addition to paging links, the Query API reports the index number at which the current page's search results start. "startIndex": 0 is the beginning of the list of results. As one pages through the search results, the startIndex increases. You can change the number of results returned per page with the rows parameter. Open Context defaults to 20 records per page, and this can be increased to a maximum of 1000 with rows=1000.
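A client can walk through result pages by following the "next" link until none is returned. The sketch below shows that loop with the fetching step passed in as a function, so it can be tried without network access. The key name "next-json" for the JSON version of the next-page link is an assumption, as are the function name iter_pages and the max_pages safety limit; check an actual response for the exact key.

```python
def iter_pages(first_url, fetch, next_key="next-json", max_pages=10):
    """Yield parsed result pages, following next-page links.

    fetch is any callable that takes a URL and returns the parsed JSON
    dictionary for that page. next_key is the (assumed) name of the key
    holding the JSON next-page link.
    """
    url, seen = first_url, 0
    while url and seen < max_pages:
        page = fetch(url)
        yield page
        seen += 1
        url = page.get(next_key)  # None on the last page ends the loop


# Stand-in for real responses: two pages of 20 rows each
fake_pages = {
    "page-1": {"startIndex": 0, "next-json": "page-2"},
    "page-2": {"startIndex": 20},
}
for page in iter_pages("page-1", fake_pages.__getitem__):
    print(page["startIndex"])
```

The max_pages limit keeps a client from crawling an unexpectedly large result set by accident.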

oc-api:active-filters

The Query API provides a list of the filters currently constraining the search results. The API provides some descriptive information for each filter, including links indicated as oc-api:remove or oc-api:remove-json that can be followed to remove that particular filtering constraint.

oc-api:has-text-search

The Query API provides a list of ways one can add an additional full-text search constraint to any search constraints already in place. The API provides a template for composing URLs to request full-text searches in the oc-api:template or oc-api:template-json keys. Substitute the {SearchTerm} in those URLs with the URL encoded text of your search term(s).
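Filling in the template amounts to a string substitution with URL encoding. The sketch below uses a made-up template URL for illustration. In practice the template would come from the oc-api:template-json key of a response, and the function name fill_search_template is hypothetical.

```python
from urllib.parse import quote


def fill_search_template(template, term):
    """Replace {SearchTerm} in a template URL with the URL-encoded term."""
    return template.replace("{SearchTerm}", quote(term, safe=""))


# Hypothetical template, shaped like the ones the API is described as returning
template = "https://staging.opencontext.org/query/.json?q={SearchTerm}"
print(fill_search_template(template, "bucchero ware"))
```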

Query API: Facets

Facets represent one of the most important and useful types of information provided by the Query API. Open Context's query API provides different kinds of facets as follows:

  • Geospatial region facets as GeoJSON features (see GeoJSON discussion)
  • Chronological (time-span) facets
  • Range facets (for numerical or date ranges) and numeric searches
  • Classification facets for searching by controlled vocabulary concepts

Each type of facet has a list of query options that can further constrain your search. The id attribute-key provides a link to the HTML (if not using content negotiation) search option, and the json attribute-key provides a link to the JSON representation of the search option. Finally, the count attribute-key provides a count of the frequency that facet option appears in the filtered set of data.

Type of Facet Definition and Examples
oc-api:has-event-time-ranges

These are hierarchically organized time-spans for dates in the past. Since the algorithm (described here) used to compute these facets needs a fixed reference point, the latest possible date in these time-spans is the year 2000 (CE). The start and stop attribute keys describe dates in years BCE / CE, with BCE dates as negative values. In a way that's similar to the geo-region facets described above, you can control the level of chronological aggregation in your requests by adding the chronodeep query parameter to request URLs.

  1. More aggregated (coarse) chronology data at chronodeep=16, the default level, as JSON-LD
  2. Less aggregated (fine-grain) chronology data at chronodeep=32, the maximum precision, as JSON-LD

Usually, the facet count will be the same as the number of records returned if you execute the search option, but that is not always the case for chronological facets. Items in Open Context (especially archaeological sites) can have multiple time spans of use. A chronological facet count reflects the number of times a particular time-span bucket is filled. A single record of an archaeological site can fill multiple time-span buckets. This sometimes makes facet counts for time-spans a close approximation rather than an exact match of filtering by a particular time-span.

oc-api:has-facets

These are hierarchically organized descriptive predicates and controlled vocabulary concepts that Open Context exposes to guide searches. Different metadata facets have some descriptive information and come with lists of options that can be used to further filter the search. There are six types of option lists for faceted searching:

  1. oc-api:has-id-options: options in this list are concepts or other named entities identified by URIs. The rdfs:isDefinedBy predicate indicates the URI for the concept defining the particular search filter.
  2. oc-api:has-boolean-options: these options describe boolean attributes.
  3. oc-api:has-integer-options: these options describe integer attributes for records. Selecting one of these options will request integer numeric range data associated with the option.
  4. oc-api:has-float-options: these options describe floating point (decimal) attributes for records. Selecting one of these options will request numeric range data associated with the option.
  5. oc-api:has-date-options: these options describe records with (recent) calendric date information. Selecting one of these options will request calendric range data associated with the option.
  6. oc-api:has-text-options: these options include fields that describe records with unstructured text (not controlled vocabularies). Selecting one of these options will request a full-text search template for this field.
oc-api:has-numeric-facets

These facets summarize the current set of search results according to a given numeric property. The minimum value (oc-api:min) and the maximum value (oc-api:max) give the total range of values for the records defined by the current search filters. In addition, the oc-api:has-range-options list quantifies the number of records in different numeric ranges. These can be visualized as histograms summarizing all of the records for a search according to this numeric property. The example below describes two numeric range options:

Open Context simply uses Solr query syntax for numeric queries. In the first example above, the query term for the numeric range filter is [3.79 TO 10.850999999999999] (URL-encoded). You can substitute that term with any valid Solr numeric query.
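To substitute your own range, you can build a Solr-style inclusive range term and URL-encode it. The following sketch does only that. The function names are hypothetical, and where the encoded term goes in a request URL depends on the facet link you are modifying.

```python
from urllib.parse import quote


def solr_range(low, high):
    """Build an inclusive Solr range term such as "[3.79 TO 10.85]"."""
    if low > high:
        raise ValueError("low must not exceed high")
    return f"[{low} TO {high}]"


def encoded_solr_range(low, high):
    """URL-encode the range term for use in a request URL."""
    return quote(solr_range(low, high), safe="")


print(solr_range(3.79, 10.85))          # [3.79 TO 10.85]
print(encoded_solr_range(3.79, 10.85))  # %5B3.79%20TO%2010.85%5D
```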

oc-api:has-date-facets

These facets summarize the current set of search results according to a given calendric property. They work in a very similar way to the numeric facets described above. One can also use valid Solr syntax to query against calendric properties.