constrained algebra of set theory for querying relations [tuple calculus?] or [APL is to arrays as RA is to tuple sets]
uses sets of tuples (rows) in a relation (table) which share attributes (columns) [relation is tuple set over attribute]
Symbols:
∪ union
− set difference
× cartesian product
σ selection (WHERE)
π projection (SELECT)
ρ rename (naming is important, after all lambda calculus is just about naming things)
∩ intersection
⋈ join
γ for grouping
∪ − × σ π ρ are minimally relationally complete as R ∩ S = R − (R − S) and R ⋈ S = σ cond (R × S). You can also do R ∩ S = R −(R − S)
SQL diverges from RA
duplicate rows while relations are sets. Tables are multisets because removing duplicates is computationally expensive
RA lacks order
null/3-valued logic - RA is 2-valued logic and assumes a closed world (no unknowns)
aggegation and group - Codd’s RA only works on wholes (but many use γ)
RA resembles APL
declaratively compose bulk ops
RA works on unordered structural records (like maps), APL uses ordered multi dimensional grids
Tuple (row) resembles an immutable map. Relation is set/bag of these maps. (In first normal form) RA uses joins, not nested structures, which we can see as precomputed joins!
RA
APL
FP
Note
σ
⊃/
filter
By predicate
π
indexing
map
Implicit in APL
∪
∪
concat
×
∘., or just ,
mapcat
ρ
←
def or let
⋈
⌿
mapcat then filter
−
/ or `~
filter
By neg. predicate
Prolog emerges from RA
state facts/relations and query
Prolog can create new facts with rules or infer them from rules on facts
Datalog is Prolog without funcs, so all querries resolve [are finite]
Set theory defines set explicitly (enumerate or comprehension). RA defines set with operations on existing sets. Prolog intensionally defines sets with rules, describing the properties of members.
extensional - list set’s actual members
intensional - list rules members must satisfy by relationship to others
Functional Programming Pipelines/Higher Order Functions
Different paradigm but same goals?
Query plan is tree, steps/transducers are a pipeline
Functional programming is operative via composition [you write how to transform, not describing what relation/answer you want. FP is like writing an execution plan, which the query planner does…] RA describes the result while FP describes transformations to reach the result (what the query planner does)
WHERE is like filter, SELECT like map
“collection programming” without loops declaring operations on set or composing traversal patterns like map, called “whole-meal programming” like APL too. But HOF are sequential and nested unlike APL’s. [IDK how to phrase this yet]
in RA reorderinng ops is semantically preserving (algebra has rewrite rules)
APL is set at a time like RA and has global rewrite laws
While FP pipelines work element by element, RA works on whole/arrays sets enabling algebraic rewrites and APL works on entire arrays (i.e. on whole relations)
Questions
Is idiomatic APL declarative?
Is it possible to write all programs via map/reduce/filter with ops/predicates inside? You do need mapcat
Is the minimum map, filter, mapcat, reduce, zipcol and comb?
To what extent is this true: FP pipelines are element by element, RA is row by row of elements, APL is table by table of rows of elements.
Can we abstract them all into 1 paradigm, operating on different dimensions?
Can we enrich any of the paradigms with operations from the others?