Karstlink how to

Introduction

The Semantic Web (SW) is more than a set of data formats. If you are accustomed to classical data formats such as spreadsheets, CSV, JSON, XML, or YAML, you are used to data in isolation. The Semantic Web is designed to facilitate links between individual documents. The key feature is the use of global identifiers instead of local ones such as database keys or XML IDs. This makes it possible to refer to data available elsewhere, and thus to enrich it.

These global identifiers are generally HTTP URLs. When sharing data with the SW, good practice is to use a URL prefix starting with your organization's URL; this way you can be sure that no other organization will use the same identifiers. For example, suppose your organization has the web site https://myorg.org/ and you have an SQL table of persons that you want to share: the identifier will be https://myorg.org/persons/111 for person number 111. Alternatives such as https://myorg.org/data/persons/111 or https://data.myorg.org/persons/111 work equally well; you can choose any scheme you want. These URLs can later serve RDF data in one of the formats described below, but this is not mandatory.
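As a rough illustration of the naming scheme above, here is a minimal Python sketch that turns a local database key into a global identifier. The prefix https://myorg.org/ is the example from the text; the function name mint_uri is made up for this illustration.

```python
from urllib.parse import quote

# Example prefix taken from the text; replace it with your organization's URL.
BASE_URL = "https://myorg.org/"


def mint_uri(table: str, key) -> str:
    """Build a global identifier from a table name and a local database key."""
    # quote() percent-encodes characters that are not safe in a URL path segment.
    return f"{BASE_URL}{quote(table, safe='')}/{quote(str(key), safe='')}"


print(mint_uri("persons", 111))  # https://myorg.org/persons/111
```

Whatever scheme you choose, the important point is to keep it stable, so that an identifier published once keeps designating the same record.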

In the SW, the atomic unit of data is a "triple": subject, property, value. The subject and the property are URLs (more generally URIs or IRIs). Depending on the data export or dump that you may already have, you can share your data in one of the concrete formats described in the sections below.

You may send us a sample of your data or data structure (e.g. SQL DDL) in any format easily obtained, so that the technical discussion can start on concrete things.

The data site that aggregates speleology data is data.grottocenter.org (4 datasets at the moment).

Share data with a .csv file

You can put your data in a CSV file. This file must follow one of the models given on the example page. Please indicate the existence of this CSV file on the wiki. The files differ for each data type (cavity, area, ...), but they share some common elements. All columns are presented in the table below.

When we speak of "resource" in the table, it is the information available in a cell of the table. If an external resource is specified, the license for that document is not known.

If a field has several values, separate them with a "|" character, for example "Peter | Eric".
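To illustrate the multi-value convention, here is a minimal Python sketch of how a consumer might read such a cell. The sample rows are invented for this illustration, and trimming the spaces around each value is an assumption based on the "Peter | Eric" example.

```python
import csv
import io

# Invented two-row sample; only the "|" convention comes from the text above.
SAMPLE = 'id,rdfs:label,dct:creator\n1,"Cave report","Peter | Eric"\n2,"Survey",Anna\n'


def split_values(cell: str) -> list[str]:
    """Split a multi-valued cell on '|' and trim surrounding spaces."""
    return [part.strip() for part in cell.split("|") if part.strip()]


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
creators = [split_values(row["dct:creator"]) for row in rows]
print(creators)  # [['Peter', 'Eric'], ['Anna']]
```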

| Column | Explanation | Mandatory | Data type |
|---|---|---|---|
| id | Unique identifier, if possible the one that the resource has in your database | yes | string |
| rdf:type | The type of resource: document, underground cavity, bio-speleo observation. Look at the ontology and the examples to find the list of labels | yes | URI or string |
| dct:rights/karstlink:licenceType | The license attached to the resource, chosen from the values present in the ontology | yes | URI or string |
| dct:rights/dct:created | Resource creation date | no | date YYYY-MM-DD |
| dct:rights/dct:modified | Resource modification date. This information is useful for managing data updates on the SPARQL servers that collect the data. | no | date YYYY-MM-DD |
| dct:rights/cc:attributionURL | URL that points to the organization or person (an agent) making the resource available. This agent plays a publisher role and holds the rights needed to make the resource available under the specified license | yes | URI |
| dct:rights/cc:attributionName | The name the creator of a Work would like used when attributing re-use. | yes | String |
| karstlink:documentType | The type of document: collection, issue, article, database. Choose among the types offered by the ontology | yes | URI or String |
| rdfs:label | The name of the resource | yes | string |
| dct:subject | The subject of the resource. Choose among the subjects appearing in the ontology | no | String or URI |
| dc:language | The primary language used in the resource | no | controlled vocabulary ISO 639-2 |
| gn:countryCode | The country linked to the resource | no | ISO country code, 2 characters |
| dct:date | The date on which the document described in the resource was published | no | String |
| dct:format | File format of the document described in the resource | no | use the list of Internet Media Types [MIME] |
| dct:identifier | An unambiguous reference to the resource | no | ISBN, ISSN, DOI, URI |
| dct:source | A link to the file described in the resource | no | URI |
| dct:creator | The author of the document described in the resource | yes | String or URI |
| dct:publisher | The organization that published the document described in the resource | no | String or URI |
| dct:isPartOf | Indicates that the document described in the resource is part of another document. For example, an article is part of a journal, an issue is part of a collection | no | A URI is better |
| dct:references | Link to another resource which is related to the document described in the resource: Point, Observation, Area, Organization | no | A URI is better |
| karstlink:relatedToUndergroundCavity | Link to an underground cavity which is related to the document described in the resource | no | A URI is better |
| karstlink:hasDescriptionDocument/dct:creator | Author of the resource description | no | String or URI |
| karstlink:hasDescriptionDocument/dc:language | Language of the resource description | no | controlled vocabulary ISO 639-2 |
| karstlink:hasDescriptionDocument/dct:title | Title of the resource description | no | String |
| karstlink:hasDescriptionDocument/dct:description | Resource description | no | String |
| gn:alternateName | Used to indicate another name when the resource is an underground cavity | no | string or URI |
| schema:containedInPlace | Used to indicate, when the resource is an underground cavity, that it is part of another underground cavity (a network) | no | A URI is better |
| w3geo:latitude | The WGS84 latitude of the resource | yes | decimal degrees |
| w3geo:longitude | The WGS84 longitude of the resource | yes | decimal degrees |
| w3geo:altitude | The altitude of the resource | no | integer |
| dwc:coordinatePrecision | A decimal representation of the precision of the coordinates given in the latitude and longitude. Use -1 for false coordinates | no | integer |
| karstlink:length | The actual developed length of all the galleries of the underground cavity | no | integer |
| karstlink:verticalExtent | The vertical distance between the highest point and the lowest point of the underground cavity | no | integer |
| karstlink:extentBelowEntrance | The vertical distance between the entrance of the underground cavity and the lowest point | no | integer |
| karstlink:extentAboveEntrance | The vertical distance between the entrance of the underground cavity and the highest point | no | integer |
| karstlink:discoveredBy | Agent (person or organization) who discovered the underground cavity | no | URI or String |
| karstlink:hasAccessDocument/dct:creator | Author of the description of the access to the resource | no | URI or String |
| karstlink:hasAccessDocument/dc:language | Language of the description of the access to the resource | | controlled vocabulary ISO 639-2 |
| karstlink:hasAccessDocument/dct:title | Title of the description of the access to the resource | no | String |
| karstlink:hasAccessDocument/dct:description | Description of the access to the resource | no | String |
| dwc:recordedBy | Person who carried out the observation | | |
| dwc:eventDate | Date of observation | | |
| dwc:identifiedBy | Person who determined the taxon | | |
| dwc:dateIdentified | Date of taxon determination | | |
| dwc:individualCount | Number of individuals of a taxon | | |
| dwc:associatedTaxa | Taxon name | yes | String or URI |
| karstlink:relatedToUndergroundCavity | Underground cavity connected to the resource | no | |
| dct:spatial | Location of the observation (link to a point or area, not to an underground cavity) | no | String or URI |
| foaf:firstName | First name of the person described in the resource | yes | String or URI |
| foaf:lastName | Last name of the person described in the resource | yes | String or URI |
| foaf:nick | Nickname of the person described in the resource | no | String or URI |
| foaf:member | Link to an organization of which the person or organization described in the resource is a member | no | String or URI |
| karstlink:visited | Link to a cavity visited by the organization or the person described in the resource | no | String or URI |
| karstlink:pointType | Used to indicate the type of resource. Choose among the types of points existing in the ontology | no | String or URI |
| foaf:mbox | An Internet mailbox. | no | mail or URI |
| foaf:homepage | A homepage is a public Web document with a URI. | no | URI |
| schema:streetAddress | The street address | no | String |
| schema:postalCode | The postal code | no | |
| schema:addressLocality | The locality in which the street address is | no | String |
| schema:addressCountry | The country code | no | the two-letter ISO 3166-1 alpha-2 country code |
| karstlink:areaType | Used to indicate the type of resource. Choose among the types of areas existing in the ontology | no | URI or String |
| schema:polygon | A polygon is the area enclosed by a point-to-point path for which the starting and ending points are the same. A polygon is expressed as a series of four or more space-delimited points where the first and final points are identical. | no | |
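As a hedged illustration of how such a file could be produced, the Python sketch below writes one row and checks that the mandatory cells are filled. The choice of columns for an underground cavity is an assumption made for this example (the authoritative model for each data type is on the example page), and all values, including the type and license strings, are placeholders rather than real ontology values.

```python
import csv
import io

# Assumption: this subset of the columns is the one relevant to an underground
# cavity. The authoritative list per data type is on the example page.
MANDATORY = [
    "id",
    "rdf:type",
    "dct:rights/karstlink:licenceType",
    "dct:rights/cc:attributionURL",
    "dct:rights/cc:attributionName",
    "rdfs:label",
    "w3geo:latitude",
    "w3geo:longitude",
]
OPTIONAL = ["gn:countryCode", "w3geo:altitude", "karstlink:length"]


def missing_mandatory(row: dict) -> list[str]:
    """Return the mandatory columns that are absent or empty in a row."""
    missing = []
    for name in MANDATORY:
        value = row.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


def to_csv(rows: list[dict]) -> str:
    """Serialize rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=MANDATORY + OPTIONAL, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)  # columns missing from a row are written as empty cells
    return buffer.getvalue()


# Placeholder values only: take real type and license values from the ontology.
cave = {
    "id": "111",
    "rdf:type": "PLACEHOLDER_TYPE",
    "dct:rights/karstlink:licenceType": "PLACEHOLDER_LICENCE",
    "dct:rights/cc:attributionURL": "https://myorg.org/",
    "dct:rights/cc:attributionName": "My Organization",
    "rdfs:label": "Example cave",
    "w3geo:latitude": "45.0",
    "w3geo:longitude": "5.5",
    "gn:countryCode": "FR",
}

problems = missing_mandatory(cave)
print(problems if problems else to_csv([cave]))
```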

Share data with RDF dump

RDF has several concrete formats (syntaxes); see RDF Serialization_formats. If you already have a JSON dump, it can be upgraded to JSON-LD without impacting current users (see the JSON-LD section below). If you already have an XML dump, it can be the starting point for an RDF/XML dump without much structural change. If you have no dump or API whatsoever and do not want to go the CSV way, there are two options: N-Triples, and direct mapping from SQL using R2RML.

N-Triples is easy to generate because it presents any data in the most atomic way, one triple per line, in line with the basics of the Semantic Web.

Here is an example line:
<http://myssite.org/person/111> <http://www.w3.org/2000/01/rdf-schema#label> "Frédéric Urien" . 
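To show how little code is needed, here is a minimal Python sketch that produces such a line. The function names are made up for this illustration, and the escaping covers only the most common cases (backslash, double quote, line breaks), not the full N-Triples grammar.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"


def escape_literal(text: str) -> str:
    """Escape the characters that cannot appear raw inside a quoted literal."""
    return (
        text.replace("\\", "\\\\")  # backslash first, so later escapes stay intact
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def triple_line(subject: str, prop: str, value: str, is_uri: bool = False) -> str:
    """Format one triple; the value is either a URI or a plain string literal."""
    obj = f"<{value}>" if is_uri else f'"{escape_literal(value)}"'
    return f"<{subject}> <{prop}> {obj} ."


print(triple_line("http://myssite.org/person/111", RDFS_LABEL, "Frédéric Urien"))
```

Writing one such line per (subject, property, value) combination to a UTF-8 text file is enough to obtain a dump.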

If you have an SQL database, a so-called direct mapping from SQL is possible using dedicated tools. There is even a W3C standard for mapping an SQL database to RDF: R2RML. Leveraging this mapping language, several tools can either generate an RDF dump or provide a SPARQL server that wraps the source SQL database as a SPARQL database.
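To give an idea of what such a mapping does, here is a hand-rolled Python sketch that turns each row of an SQL table into N-Triples lines. It is neither R2RML nor the output of any particular tool: the URL scheme (table name and key for the subject, table name and column name for the property) is an assumption chosen for this illustration, and the in-memory SQLite table is invented.

```python
import sqlite3

BASE = "https://myorg.org/"  # example prefix; use your organization's URL


def dump_table(conn: sqlite3.Connection, table: str, key_column: str) -> list[str]:
    """Emit one N-Triples line per non-null cell of the table."""
    # The table name is interpolated into the SQL, so it must come from trusted code.
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    columns = [description[0] for description in cursor.description]
    lines = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{BASE}{table}/{record[key_column]}"
        for column, value in record.items():
            if column == key_column or value is None:
                continue  # the key is already in the subject; NULL yields no triple
            literal = (
                str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
            )
            lines.append(f'<{subject}> <{BASE}{table}#{column}> "{literal}" .')
    return lines


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE persons (id INTEGER PRIMARY KEY, name TEXT, nick TEXT)")
conn.execute("INSERT INTO persons VALUES (111, 'Frédéric Urien', NULL)")
for line in dump_table(conn, "persons", "id"):
    print(line)
```

A real R2RML mapping goes further, for example by turning foreign keys into links between resources and by mapping columns to properties of a shared ontology instead of table-specific URLs.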

Turtle is widely used in the Semantic Web. It is a superset of N-Triples, meaning that every N-Triples document is also a Turtle document. It adds several syntactic features that make it shorter and more human-readable, but also much more complex to generate. So I would not recommend using it when starting to share data from scratch.

Share data with a JSON-LD API

If you already have a JSON API, you can use JSON-LD to extend the existing JSON returned by the API without impacting current users. JSON-LD provides a way to interpret a key-value pair in the original data as an RDF triple. It can work with no modification at all to the JSON API, but it works best when a few dedicated special keys are added: @id, @type, and @context.
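As a minimal sketch of this idea, the Python code below adds the three special keys to an existing JSON object without touching the keys already there. The payload, the property mapping, and the type URL are all invented for this illustration; they are not taken from an actual Karstlink @context.

```python
import json

# Invented payload standing in for what an existing JSON API might return.
api_response = {"id": 555, "name": "Example massif"}

# Invented mapping from an existing key to a property URL.
CONTEXT = {"name": "http://www.w3.org/2000/01/rdf-schema#label"}


def to_json_ld(payload: dict, base_url: str, type_url: str) -> dict:
    """Return a copy of the payload augmented with @context, @id and @type."""
    doc = dict(payload)  # existing keys are kept as they are
    doc["@context"] = CONTEXT
    doc["@id"] = f"{base_url}{payload['id']}"
    doc["@type"] = type_url
    return doc


doc = to_json_ld(api_response, "https://myorg.org/massifs/", "https://myorg.org/ontology#Massif")
print(json.dumps(doc, indent=2))
```

A client that only knows the original API still finds "id" and "name" where it expects them, which is why current users are not affected.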

An @context is a special JSON declaration document that describes the mapping. We can write one for you. Here are examples of @context JSON documents recently written for Karstlink:

And here is an example of a JSON API augmented with the JSON-LD special keys: https://beta.grottocenter.org/api/v1/massifs/555 . It should be noted that JSON-LD documents are 100% syntactically correct JSON documents.
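To make the role of @context more concrete, here is a deliberately simplified Python sketch of how a consumer could read a flat JSON-LD document as triples. It is not a JSON-LD processor: it only handles a flat object whose context maps keys directly to property URLs, and the document shown is invented for this illustration.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def to_triples(doc: dict) -> list[tuple]:
    """Read a flat JSON-LD-like object as (subject, property, value) triples."""
    context = doc.get("@context", {})
    subject = doc["@id"]
    triples = []
    if "@type" in doc:
        triples.append((subject, RDF_TYPE, doc["@type"]))
    for key, value in doc.items():
        if key.startswith("@") or key not in context:
            continue  # keys without a mapping in the context produce no triple
        triples.append((subject, context[key], value))
    return triples


# Invented document: "internalFlag" has no mapping, so it is ignored.
doc = {
    "@context": {"name": "http://www.w3.org/2000/01/rdf-schema#label"},
    "@id": "https://myorg.org/massifs/555",
    "@type": "https://myorg.org/ontology#Massif",
    "name": "Example massif",
    "internalFlag": True,
}

for triple in to_triples(doc):
    print(triple)
```

In practice an existing JSON-LD library would do this work; the sketch only shows that the original JSON stays valid while the context supplies the meaning of each key.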