Con
Designing models independently from how they will be used sounds like a recipe for disaster. The idea that concepts like ‘actor’ or ‘movie’ have universal descriptions independent from how the concepts are utilized seems deeply flawed to me.
https://plato.stanford.edu/entries/wittgenstein/#MeanUse
Examples
https://netflixtechblog.com/uda-unified-data-architecture-6a6aee261d8d
https://github.com/CategoricalData/hydra done by Joshua after https://www.uber.com/blog/dragon-schema-integration-at-uber-scale/
Pro
It’s been so long since the Semantic Web and RDF and OWL and SKOS. I’m so glad they stuck with W3C and didn’t reinvent those wheels. Will this UDA approach catch on? I don’t know, but I hope so. It seems to be pushing the frontier on the hard problem of applying Domain-Driven Design and semantic concepts at an enterprise of significant scale.
If we can get compound interest across development teams by giving them a common toolset and skillset that covers different applications but the same data semantics, maybe not every data contract will have to be reduced to DTOs that can be POSTed or otherwise forced to be a least common denominator just so it can fit past a network or other IPC barrier.
https://berezovskyi.me/ is a fount of knowledge and passion
I hope this or a similar initiative will go big, because the real benefits of such approaches only materialize when you need to connect more than 3 systems with differing information models, and when there is enough uptake in the market that tools come with such APIs out of the box. For example, OSLC (disclosure: I am involved with the project) gets attention when someone tries to connect IBM Jazz and Siemens Polarion, which come with OSLC out of the box. There is far less interest in creating OSLC APIs yourself for systems/tools you wish to integrate, even in cases where the benefits are the same as for Jazz/Polarion.
Upper is the metamodel for Connected Data in UDA — the model for all models. It is designed as a bootstrapping upper ontology, which means that Upper is self-referencing, because it models itself as a domain model; self-describing, because it defines the very concept of a domain model; and self-validating, because it conforms to its own model.
Are we certain this means something?
I am somewhat certain it does. Unfortunately, a BFO book or lecture may need to be consumed and understood before this fully makes sense. Upper ontologies are quite useful to help prevent grave information modelling mistakes (cf. BFO’s dependents vs continuants). This is somewhat helpful only if you decide to forgo ontological nominalism (“these things will be called ducks because I said so”) and instead adopt ontological realism (“these things look similar and quack, therefore we will call them ducks”).
I, however, have three nits about the statement you highlighted:
- RDF itself is homoiconic, because models of RDF data are themselves RDF (in a similar way to Lisp). To see an example, compare a model in Section 4 with the actual data in Section 2.1 in the RDF Primer. In other words, you can start using RDF today and get some of those properties from day one.
- At some level (deep enough, and an upper ontology is the last stop), recursive/self-referencing definitions are actually undesirable because you are trying to avoid logical errors (fallacies) such as circular reasoning.
- Finally, upper ontologies are very easy to bungle and they should be left to people with a Ph.D. in philosophy. There are excellent UOs out there (BFO is my pick, also see DOLCE Ultralite, possibly SUMO) and you should just use one (i.e. define your own classes as subclasses of UO terms etc.). Defining your own UO is like saying that you didn’t like the stdlib of a language and wrote your own. Furthermore, Upper does not seem like a real upper ontology, as it defines quite applied concepts (the idea of an upper ontology is to be above anything applied).
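To make the first nit concrete, here is a minimal sketch of "the model is just more data". It assumes triples are plain Python tuples and that the prefixed names are opaque strings rather than resolved IRIs. The `ex:` terms and the `types_of` helper are hypothetical and exist only for illustration. The one statement taken from RDFS itself is that `rdfs:Class` is declared to be an instance of `rdfs:Class`.

```python
# Triples as plain tuples; prefixed names are shorthand strings, not real IRIs.
RDF_TYPE = "rdf:type"
RDFS_CLASS = "rdfs:Class"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"

graph = {
    # Schema-level statements (the "model").
    (RDFS_CLASS, RDF_TYPE, RDFS_CLASS),  # RDFS describes its own Class as a Class
    ("ex:Person", RDF_TYPE, RDFS_CLASS),  # hypothetical domain class
    ("ex:Actor", RDF_TYPE, RDFS_CLASS),  # hypothetical domain class
    ("ex:Actor", RDFS_SUBCLASS_OF, "ex:Person"),
    # Instance-level statement (the "data"), stored in the very same set.
    ("ex:alice", RDF_TYPE, "ex:Actor"),
}


def objects(graph, subject, predicate):
    """Return every object o such that (subject, predicate, o) is in the graph."""
    return {o for s, p, o in graph if s == subject and p == predicate}


def types_of(graph, node):
    """Direct rdf:type values of node plus their rdfs:subClassOf ancestors."""
    found = set()
    pending = list(objects(graph, node, RDF_TYPE))
    while pending:
        cls = pending.pop()
        if cls not in found:
            found.add(cls)
            pending.extend(objects(graph, cls, RDFS_SUBCLASS_OF))
    return found


alice_types = types_of(graph, "ex:alice")    # data-level question
actor_types = types_of(graph, "ex:Actor")    # schema-level question, same function
class_types = types_of(graph, RDFS_CLASS)    # the self-describing case
```

The same lookup answers questions about an instance, about a class, and about `rdfs:Class` itself, which is roughly what "self-describing" buys you without defining a new upper ontology. A real system would use an RDF library and full IRIs; this sketch only illustrates the shape of the idea.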
At the same time, I hope this gains momentum and maybe Netflix merges/aligns Upper with at least schema.org (popular for webpage metadata) or ISO/CD 23726-3 Industrial Data Ontology (quite popular in Oil&Gas).
P.S. While we are on the topic of things that might or might not mean anything at all - my favourites are atomless gunk and worldless junk. Are they real? We might never know.
Below are some links for extra reading from my favorites.
High-level overview:
Similar recent attempts:
https://www.uber.com/en-SE/blog/dragon-schema-integration-at… an attempt in a similar direction at Uber
https://www.slideshare.net/joshsh/transpilers-gone-wild-intr… continuation of the Uber Dragon effort at LinkedIn
Standards and specs in support of such architectures:
http://www.lotico.com/index.php/Next_Generation_RDF_and_SPAR… (RDF is the only standard in the world for graph data that is widely used; combining graph API responses from N endpoints is a straightforward graph union vs N-way graph merge for JSON/XML/other tree-based formats). Also see https://w3id.org/jelly/jelly-jvm/ if you are looking for a binary RDF serialization.
https://www.w3.org/TR/shacl/ (needs tooling, see above)
https://www.odata.org/ (in theory has means to reuse definitions, does not seem to work in practice)
https://www.w3.org/TR/ldp/ (great foundation, too few features - some specs like paging never reached Recommendation status)
https://open-services.net/ (builds atop W3C LDP; full disclosure: I’m involved in this one)
https://www.w3.org/ns/hydra/ (focus on describing arbitrary affordances; not related to LinkedIn Hydra in any way)
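The graph-union remark in the RDF entry above can be sketched as follows. This is an illustration under stated assumptions, not how any particular library does it: triples are plain tuples, there are no blank nodes (which would need renaming before a merge), and the two "endpoints", the `ex:` terms, and the helper names are all hypothetical.

```python
def union_graphs(*graphs):
    """Merge any number of triple sets; duplicate triples collapse automatically."""
    merged = set()
    for g in graphs:
        merged |= g
    return merged


def actor_names(graph, movie):
    """Follow ex:hasActor and then ex:name across whatever sources contributed."""
    actors = {o for s, p, o in graph if s == movie and p == "ex:hasActor"}
    return {o for s, p, o in graph if s in actors and p == "ex:name"}


# Hypothetical response from a catalog service.
catalog = {
    ("ex:movie1", "ex:title", "Example Movie"),
    ("ex:movie1", "ex:hasActor", "ex:alice"),
}

# Hypothetical response from a talent service; it repeats one triple.
talent = {
    ("ex:movie1", "ex:hasActor", "ex:alice"),
    ("ex:alice", "ex:name", "Alice"),
}

merged = union_graphs(catalog, talent)
names = actor_names(merged, "ex:movie1")
```

Because both sources use the same identifier for the same thing, no merge policy is needed: the repeated triple deduplicates and the query spans both sources. With nested JSON or XML you would first have to decide how the trees align and which side wins on conflict.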
Upper models:
https://basic-formal-ontology.org/ - the gold standard. See https://www.youtube.com/watch?v=GWkk5AfRCpM for the tutorial
https://www.iso.org/standard/87560.html - Industrial Data Ontology. There is a lot of activity around this one, but I lean towards BFO. See https://rds.posccaesar.org/WD_IDO.pdf for the unpaywalled draft and https://www.youtube.com/watch?v=uyjnJLGa4zI&list=PLr0AcmG4Ol… for the videos