Commons:Structured data/Modeling
This page contains an overview of how to model information (metadata) about files on Wikimedia Commons in Structured Data.
The basics: structured data for every Commons file
The following structured data is relevant for every file on Wikimedia Commons. It roughly corresponds to the information stored in the Information template, a general-purpose infobox template used to describe files in wikitext.
Structured data to add | Brief instructions | In-depth instructions about the data model |
---|---|---|
File caption(s) (multilingual) | A (short) textual description of the file, in at least one language. Plain text; no Wiki markup or hyperlinks. | Data modeling guidelines: File captions |
Date | Usually the date when the file was created, expressed with an inception (P571) statement. | Data modeling guidelines: Date |
Source of the file | Information about where the file was taken from. Is it the uploader's own work, or was it uploaded from an external website? Typically expressed with a source of file (P7482) statement. | Data modeling guidelines: Source of the file |
Creator | Who created the file? Typically described with a creator (P170) statement. | Data modeling guidelines: Creator of the file |
Copyright status and license | Is the file still under copyright, or is it in the public domain? If it is still under copyright, which license(s) apply? Expressed with copyright status (P6216) and copyright license (P275) statements. | Data modeling guidelines: Copyright and licenses |
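The table above can be read as a checklist. The following is a minimal Python sketch, not an official tool, that reports which of these basics are missing from a file. It assumes the file's structured data is available as a dictionary shaped like the MediaInfo JSON returned by the `wbgetentities` API (with `labels` for captions and `statements` keyed by property ID); the function name `missing_basics` is hypothetical. Note that a public domain file may legitimately have no copyright license (P275) statement, so that entry should be read as a hint rather than an error.

```python
# Properties from the table above, in the same order.
BASIC_PROPERTIES = {
    "P571": "inception",
    "P7482": "source of file",
    "P170": "creator",
    "P6216": "copyright status",
    "P275": "copyright license",  # may be absent for public domain files
}


def missing_basics(mediainfo):
    """Return a list of basic structured data elements the file lacks."""
    missing = []
    # Captions are stored as labels, one per language.
    if not mediainfo.get("labels"):
        missing.append("file caption")
    # An entity without statements may serialize "statements" as an empty
    # list instead of a dict, so normalize it first.
    statements = mediainfo.get("statements") or {}
    for property_id, name in BASIC_PROPERTIES.items():
        if not statements.get(property_id):
            missing.append(f"{name} ({property_id})")
    return missing


# Hypothetical file that only has a caption, a date and a creator.
example = {
    "labels": {"en": {"language": "en", "value": "Electric motorcycle"}},
    "statements": {"P571": [{}], "P170": [{}]},
}
print(missing_basics(example))
```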
If the above structured data is added to a file, the file's wikitext description can be simplified as follows:
File (click to explore how it is described) | Wikitext | Main structured data |
---|---|---|
(file preview not shown) | == {{int:filedesc}} == {{Information}} == {{int:license-header}} == {{self|cc-by-sa-4.0}} [[Category:Energica Ego]] | (not shown) |
An overview of further structured data properties that are in active use can be found at Commons:Structured data/Properties table.
The specifics: case examples of common Commons files
Own work upload directly to Commons
To describe a simple {{Own}} work that was uploaded directly by its author, or {{Self}}-licensed by the uploader:
- File caption: one or more short description(s) of the file + language
- Date: inception (P571), see Commons:Structured data/Modeling/Date
- Source of the file: source of file (P7482) → original creation by uploader (Q66458942)
- Creator of the file: creator (P170) → "some value" to indicate that the creator doesn't have a Wikidata item. Qualified with:
  - object has role (P3831) → photographer (Q33231) to indicate that the creator is the photographer; use this only if the file is a picture and not a video or audio file
  - author name string (P2093) → "<some name>" to indicate which name should be shown, usually a username or a real name
  - Wikimedia username (P4174) → "<some username>" to indicate the contributing user
  - URL (P2699) → "https://commons.wikimedia.org/wiki/User:<some username>" to link back to the user page of the contributing user, if it exists
- Copyright and licenses: copyright license (P275) and copyright status (P6216), see Commons:Structured data/Modeling/Copyright
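As an illustration, the source and creator statements listed above can be assembled as Wikibase-style claim JSON. This is a minimal sketch under stated assumptions: the helper names (`item_snak`, `string_snak`, `own_work_claims`) are hypothetical, the JSON layout follows the general Wikibase claim format as used by `wbeditentity`, and the sketch does not check whether the user page actually exists. Caption, date and copyright statements are left out; see the linked modeling pages for those.

```python
from urllib.parse import quote


def item_snak(prop, qid):
    """Snak whose value is a Wikidata item, for example Q33231."""
    return {
        "snaktype": "value",
        "property": prop,
        "datavalue": {
            "type": "wikibase-entityid",
            "value": {"entity-type": "item", "numeric-id": int(qid[1:]), "id": qid},
        },
    }


def string_snak(prop, text):
    """Snak for string-like values (plain strings, usernames, URLs)."""
    return {
        "snaktype": "value",
        "property": prop,
        "datavalue": {"type": "string", "value": text},
    }


def own_work_claims(username, display_name, is_photo=True):
    """Build the source and creator claims for an own-work upload."""
    # source of file (P7482) -> original creation by uploader (Q66458942)
    source = {
        "type": "statement",
        "rank": "normal",
        "mainsnak": item_snak("P7482", "Q66458942"),
    }
    qualifiers = {}
    if is_photo:
        # object has role (P3831) -> photographer (Q33231), pictures only
        qualifiers["P3831"] = [item_snak("P3831", "Q33231")]
    qualifiers["P2093"] = [string_snak("P2093", display_name)]
    qualifiers["P4174"] = [string_snak("P4174", username)]
    user_page = "https://commons.wikimedia.org/wiki/User:" + quote(
        username.replace(" ", "_")
    )
    qualifiers["P2699"] = [string_snak("P2699", user_page)]
    # creator (P170) -> "some value": the creator has no Wikidata item
    creator = {
        "type": "statement",
        "rank": "normal",
        "mainsnak": {"snaktype": "somevalue", "property": "P170"},
        "qualifiers": qualifiers,
    }
    return [source, creator]


claims = own_work_claims("Example User", "Example User")
```

The resulting list could be serialized with `json.dumps({"claims": claims})` and passed as the data of an edit request; authentication and the request itself are outside the scope of this sketch.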
Upload from a platform like Panoramio, Geograph or Flickr
To describe a file uploaded directly from a platform (preferably by a tool or bot, for consistency):
- File caption: one or more short description(s) of the file + language
- Date: inception (P571), see Commons:Structured data/Modeling/Date
- Source of the file: source of file (P7482) → file available on the internet (Q74228490) to indicate the source. Qualified with:
  - operator (P137) → e.g. Panoramio (Q239516) to indicate the platform
  - described at URL (P973) → "<some url>" to indicate the location
- Creator of the file: creator (P170) → "some value" to indicate that the creator doesn't have a Wikidata item. Qualified with:
  - object has role (P3831) → photographer (Q33231) to indicate that the creator is the photographer; use this only if the file is a picture and not a video or audio file
  - Flickr user ID (P3267) → "<flickr/... user number>" to indicate the Flickr user identifier (number), if applicable
  - author name string (P2093) → "<some name>" to indicate which name should be shown, usually a username or a real name on the platform
  - URL (P2699) → "<some url>" to indicate the URL of the page where the file is located
- Copyright and licenses: copyright license (P275) and copyright status (P6216), see Commons:Structured data/Modeling/Copyright
For Flickr uploads, please also see Commons:Flickypedia/Data Modeling.
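The platform model differs from the own-work model mainly in that the source statement carries its own qualifiers. A minimal sketch of this, again as Wikibase-style claim JSON: the function and parameter names are hypothetical, the example URL and author name are placeholders, and the same page URL is reused for described at URL (P973) and URL (P2699), which is an assumption based on the descriptions above.

```python
def item_snak(prop, qid):
    """Snak whose value is a Wikidata item."""
    return {
        "snaktype": "value",
        "property": prop,
        "datavalue": {
            "type": "wikibase-entityid",
            "value": {"entity-type": "item", "numeric-id": int(qid[1:]), "id": qid},
        },
    }


def string_snak(prop, text):
    """Snak for string-like values (strings, external IDs, URLs)."""
    return {
        "snaktype": "value",
        "property": prop,
        "datavalue": {"type": "string", "value": text},
    }


def platform_claims(platform_qid, page_url, author_name,
                    flickr_user_id=None, is_photo=True):
    """Build the source and creator claims for a platform upload."""
    # source of file (P7482) -> file available on the internet (Q74228490),
    # qualified with the platform and the page the file came from.
    source = {
        "type": "statement",
        "rank": "normal",
        "mainsnak": item_snak("P7482", "Q74228490"),
        "qualifiers": {
            "P137": [item_snak("P137", platform_qid)],
            "P973": [string_snak("P973", page_url)],
        },
    }
    qualifiers = {}
    if is_photo:
        qualifiers["P3831"] = [item_snak("P3831", "Q33231")]
    if flickr_user_id is not None:
        # Only applicable to Flickr uploads.
        qualifiers["P3267"] = [string_snak("P3267", flickr_user_id)]
    qualifiers["P2093"] = [string_snak("P2093", author_name)]
    qualifiers["P2699"] = [string_snak("P2699", page_url)]
    creator = {
        "type": "statement",
        "rank": "normal",
        "mainsnak": {"snaktype": "somevalue", "property": "P170"},
        "qualifiers": qualifiers,
    }
    return [source, creator]


# Placeholder values; Q239516 is Panoramio, as in the list above.
claims = platform_claims("Q239516", "https://example.org/photo/123", "Some Name")
```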
Pronunciation
- Copyright and licenses: copyright license (P275) and copyright status (P6216), see Commons:Structured data/Modeling/Copyright
- Type: instance of (P31) → pronunciation file (Q108167708)
- Language: language of work or name (P407) → e.g. French (Q150)
- Transcription: audio transcription (P9533) → "<verbatim>" to describe what is pronounced
- Recording date: recording date (P10135)
- Who recorded it: recordist (P10893)
- Who pronounced it: spoken by (P10894)
- IDs: e.g. Lingua Libre ID (P10369) → "<id>" to describe the source identifier if applicable
How to model more specific types of files
- Visual artworks - work has a Wikidata item (work in progress)
- Works without a Wikidata item (work in progress)
- Maps (work in progress)
- Illustrations (work in progress)
- Conference talks (work in progress)
How to model specific types of metadata
Here, we look at specific types of metadata for a file:
- Depiction, Digital representation of, and Main subject (work in progress)
- Date (work in progress)
- Author and Creator (work in progress)
- Source (work in progress)
- Copyright and Licensing (work in progress)
- Metadata (work in progress)
- Location (work in progress)
- Participants and Sponsors (work in progress)
- Quality and Maintenance (work in progress)
- Significant Event (work in progress)
- Image captured with (work in progress)
GLAM
In some cases, large-scale content contributions mainly originating from Galleries, Libraries, Archives, and Museums (GLAM) use more specific data models.
It is highly recommended that all file metadata also complies with the general, basic data modeling recommendations listed above. This will make sure that all data on Wikimedia Commons can be uniformly searched and queried across the entire platform.
Content-specific properties may be added, such as:
- The Metropolitan Museum of Art: The Met object ID (P3634) → "<id>"
- iNaturalist: iNaturalist observation ID (P5683) → "<id>"
- Digital Public Library of America → Please see: Commons:Digital Public Library of America/Modeling
- Biodiversity Heritage Library → Please see: Commons:Biodiversity Heritage Library/Modeling
Bots
Some bots automatically populate structured data on Commons (SDC) based on metadata in Commons templates.
- BotMultichill adds properties for various IDs.
- BotMultichillT populates date, coordinates, camera, source, copyright and author.
- SchlurcherBot adds various types of SDC to files per the modelling described here.
- JarektBot adds Wikimedia VRTS ticket number (P6305) and digital representation of (P6243) using QuickStatements.
- AliciaFagervingWMSE-bot uploads creator (P170), inception (P571), coordinates of the point of view (P1259), depicts (P180), and participant in (P1344) to Wiki Loves Monuments files from Sweden, Israel, and Poland.
- METbot adds The Met object ID (P3634) and collection (P195) to the Metropolitan Museum of Art files.
- GeographBot uploads different SDC information to Geograph Britain and Ireland files.
- DPLA bot added structured data claims to DPLA items.
- XRayBot adds or updates coordinates of depicted place (P9149) and other properties, but only on XRay's own photographs.
- NikkiBot adds structured data to Lingua Libre uploads.
- FlickypediaBackfillrBot adds structured data to Flickr files.
- Emijrpbot adds structured data for various camera properties from Exif data.
General remarks
- What should be the general order of statements in the structured data statements tab? The community can indicate and change this order at MediaWiki:Wikibase-SortedProperties (see also the equivalent page on Wikidata).
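To illustrate how such an ordering page could be applied, here is a minimal sketch that sorts a file's property IDs by a community-defined order. It assumes the page lists one property per bullet line (for example `* P170`), possibly grouped under section headings; the sample wikitext below is invented for the example and is not the actual page content. Placing unlisted properties last, in numeric order, is a choice made for this sketch and not a statement about how Wikibase behaves.

```python
import re


def parse_sorted_properties(wikitext):
    """Extract property IDs from bullet lines such as '* P170'."""
    return re.findall(r"^\*\s*(P\d+)\b", wikitext, flags=re.MULTILINE)


def order_statements(property_ids, sorted_properties):
    """Sort property IDs by the given order; unlisted ones go last."""
    rank = {pid: index for index, pid in enumerate(sorted_properties)}
    return sorted(
        property_ids,
        key=lambda pid: (rank.get(pid, len(rank)), int(pid[1:])),
    )


# Invented sample, not the real content of MediaWiki:Wikibase-SortedProperties.
sample_page = """== Basics ==
* P170
* P571
* P7482
== Rights ==
* P6216
* P275
"""

order = parse_sorted_properties(sample_page)
print(order_statements(["P275", "P180", "P571", "P170", "P31"], order))
```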