Post by J. Hereth Correia
Post by Peter Becker
Hello all,
some of you might be interested: I just started the package
org.tockit.relations in the Tockit Java branch. The idea is to provide
a rather complete (although not necessarily efficient) implementation
of relational algebra. So far I have moved bits of the Tupleware
model into it; operations (selection, projection, union, crossproduct,
intersection...) will follow. The model allows attaching names to the
dimensions, but this is optional. The sources as used in Tupleware with
their UI parts might follow later on -- we probably should clean up some
bits first.
In case you have specific interests in this area feel free to ask for
features -- they might be easy to do while I am working on this topic
anyway.
Peter
Hi Peter,
I think it's good to work on those basic structures. I still haven't had
the time to look into Tupleware, so I have only two very general ideas.
The first one is about naming: make it usable, i.e. allow selecting
columns by name and provide an interface to ask for all column names.
That is something I consider to be application-specific. Tupleware
handles the column names (at the moment I call them "dimension names",
but I don't really like it). It gets them either from the first line
(text file/CLI input), or the query variable names (SQL/RDQL). But the
library should keep them optional IMO.
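
A minimal sketch of how optional dimension names could look, assuming a small helper class. The name DimensionNames and all of its methods are hypothetical and not part of org.tockit.relations. It supports both requests (lookup by name, listing all names) while still allowing a relation without names:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration; not the actual Tockit API.
final class DimensionNames {
    // Empty when the relation carries no names.
    private final List<String> names;

    private DimensionNames(List<String> names) {
        this.names = names;
    }

    // A relation whose dimensions are addressed by index only.
    static DimensionNames unnamed() {
        return new DimensionNames(Collections.emptyList());
    }

    // A relation whose dimensions carry the given names, in order.
    static DimensionNames of(String... names) {
        return new DimensionNames(
                Collections.unmodifiableList(Arrays.asList(names.clone())));
    }

    boolean hasNames() {
        return !names.isEmpty();
    }

    // All dimension names in column order (empty if unnamed).
    List<String> getNames() {
        return names;
    }

    // Maps a name to the dimension index used by the index-based operations.
    int indexOf(String name) {
        int index = names.indexOf(name);
        if (index < 0) {
            throw new IllegalArgumentException("Unknown dimension name: " + name);
        }
        return index;
    }
}
```

With this shape the operations themselves can stay purely index-based, and an application such as Tupleware translates names to indices before calling them.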
Post by J. Hereth Correia
Another point would be the domain of each column, e.g. to know that in
the 'PLZ' column only postal codes are allowed. So far, I have no idea
how this could be implemented; maybe for a start simply a string with
a description, which might also be a URI identifying a set.
I don't want to go down this road yet (but probably later). In Java I'd
probably use what I call "dynamic RTTI" -- I'd attach a Class[] to the
relation and check the Object[] in the tuples to be instances
elementwise. If you want other mappings, e.g. to SQL types, XSD types or
something else, you can wrap that in Java types (the SQL mappings are
actually part of JDBC and the XSD mappings are something I want to do for
Siena).
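
A minimal sketch of the "dynamic RTTI" idea described above: a Class[] attached to the relation, with every tuple checked elementwise on insertion. The class name TypedRelation and its methods are assumptions made for illustration, not the Tockit code, and tuples are stored as lists so that set semantics work through equals/hashCode:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical class for illustration; not the actual org.tockit.relations API.
final class TypedRelation {
    private final Class<?>[] dimensionTypes;
    // Tuples are kept as lists because arrays only compare by identity.
    private final Set<List<Object>> tuples = new HashSet<>();

    TypedRelation(Class<?>... dimensionTypes) {
        this.dimensionTypes = dimensionTypes.clone();
    }

    // Adds a tuple after checking each element against the type of its dimension.
    void addTuple(Object... tuple) {
        if (tuple.length != dimensionTypes.length) {
            throw new IllegalArgumentException(
                    "Expected arity " + dimensionTypes.length + ", got " + tuple.length);
        }
        for (int i = 0; i < tuple.length; i++) {
            // Class.isInstance is the runtime form of instanceof; it is false for null.
            if (!dimensionTypes[i].isInstance(tuple[i])) {
                throw new IllegalArgumentException(
                        "Element " + i + " is not an instance of " + dimensionTypes[i].getName());
            }
        }
        tuples.add(Arrays.asList(tuple.clone()));
    }

    int size() {
        return tuples.size();
    }
}
```

Other type systems (SQL, XSD) would then plug in by choosing suitable Java wrapper classes for the Class[] entries, as suggested above.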
Post by J. Hereth Correia
About the operations, I'll have to wait until I see what you have done.
Permutation: Reordering of columns, makes no difference when using names
Do you think I should combine that with projection? I haven't made up my
mind about the interface for projection yet -- I think it should get an
int[] as parameter, denoting the dimensions to leave in. Should they be
assumed to be in order or do we allow implicit permutations? In the
latter case we could treat permutation as a projection onto as many
dimensions as the input. Alternatively one could give dimensions to
drop. Or even this:
public class ProjectionOperator implements UnaryRelationOperator {
    // Factory methods (signatures only, bodies omitted)
    public static ProjectionOperator getPositiveOperator(int[] dimensionsToPick);
    public static ProjectionOperator getNegativeOperator(int[] dimensionsToDrop);
    public static ProjectionOperator getNegativeOperator(int dimensionToDrop);
    public static ProjectionOperator getPermutationOperator(int[] dimensionsToPermute);
    // non-factory code
}
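
A minimal sketch of the "implicit permutation" variant discussed above, where the int[] lists the dimensions to keep in the order they should appear. It assumes relations are represented as Set<List<Object>>; the class ProjectionSketch and its method names are hypothetical. The negative form is reduced to the positive one by computing the kept dimensions first:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.IntStream;

// Hypothetical helper for illustration; not the actual Tockit API.
final class ProjectionSketch {
    // Keeps the given dimensions in the given order, so a permutation is
    // simply a projection onto all dimensions in a different order.
    static Set<List<Object>> project(Set<List<Object>> relation, int[] dimensionsToPick) {
        Set<List<Object>> result = new LinkedHashSet<>();
        for (List<Object> tuple : relation) {
            List<Object> projected = new ArrayList<>(dimensionsToPick.length);
            for (int dim : dimensionsToPick) {
                projected.add(tuple.get(dim));
            }
            result.add(projected); // the set removes duplicates created by dropping columns
        }
        return result;
    }

    // Turns "dimensions to drop" into "dimensions to keep", preserving order.
    static int[] keptDimensions(int arity, int[] dimensionsToDrop) {
        boolean[] dropped = new boolean[arity];
        for (int dim : dimensionsToDrop) {
            dropped[dim] = true;
        }
        return IntStream.range(0, arity).filter(i -> !dropped[i]).toArray();
    }
}
```

Under these assumptions, project(r, new int[] {1, 0}) swaps the columns of a binary relation, and project(r, keptDimensions(3, new int[] {1})) drops the middle column of a ternary one.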
Post by J. Hereth Correia
Juxtaposition: another name for cross product
Can we call it crossproduct then?
Post by J. Hereth Correia
Join: select two columns of a relation. The resulting relation has all
tuples that have the same values in those two columns and these two
columns are *removed*
Sure. I forgot to list that one, but of course it should be there.
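
A minimal sketch of the join as defined in the quoted text (keep the tuples whose two selected columns agree, then remove both columns). Relations are assumed to be Set<List<Object>>; JoinSketch and join are hypothetical names:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Hypothetical helper for illustration; not the actual Tockit API.
final class JoinSketch {
    static Set<List<Object>> join(Set<List<Object>> relation, int first, int second) {
        if (first == second) {
            throw new IllegalArgumentException("The two columns must be different");
        }
        Set<List<Object>> result = new LinkedHashSet<>();
        for (List<Object> tuple : relation) {
            // Selection step: both columns must hold equal values.
            if (!Objects.equals(tuple.get(first), tuple.get(second))) {
                continue;
            }
            // Projection step: drop both joined columns.
            List<Object> rest = new ArrayList<>();
            for (int i = 0; i < tuple.size(); i++) {
                if (i != first && i != second) {
                    rest.add(tuple.get(i));
                }
            }
            result.add(rest);
        }
        return result;
    }
}
```

Applied to the crossproduct of two relations, this gives the usual join between them, minus the shared column.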
Post by J. Hereth Correia
Negation: The complement of the relation. neg(R) is the cross product of
the domains of the columns minus R.
That is more tricky since it involves understanding what the domains
are. Of course we could give the constructor/factory method a Set[] with
the full domains.
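
A minimal sketch of negation with explicitly supplied domains: build the cross product of the domains, then subtract R. The domains are passed as a List<Set<Object>> here instead of a Set[] to avoid generic arrays; NegationSketch and complement are hypothetical names, and the approach is deliberately naive since it materializes the full cross product:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not the actual Tockit API.
final class NegationSketch {
    static Set<List<Object>> complement(Set<List<Object>> relation,
                                        List<Set<Object>> domains) {
        // Start with the single empty tuple and extend it one dimension at a time.
        Set<List<Object>> product = new LinkedHashSet<>();
        product.add(new ArrayList<>());
        for (Set<Object> domain : domains) {
            Set<List<Object>> extended = new LinkedHashSet<>();
            for (List<Object> prefix : product) {
                for (Object value : domain) {
                    List<Object> tuple = new ArrayList<>(prefix);
                    tuple.add(value);
                    extended.add(tuple);
                }
            }
            product = extended;
        }
        // neg(R) = (D1 x D2 x ... x Dn) minus R
        product.removeAll(relation);
        return product;
    }
}
```

The intermediate set has as many tuples as the product of the domain sizes, so this only works for small finite domains.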
Post by J. Hereth Correia
Teridentity: a constant, the set of all triples (x,x,x) for any x.
That is not an operator, is it? But we could have some method to get the
teridentity for a particular set. The current model will not be able to
handle the general case since relations are sets of tuples.
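
A minimal sketch of "the teridentity for a particular set": given a finite set, produce all triples (x,x,x). Relations are again assumed to be Set<List<Object>>; TeridentitySketch and teridentity are hypothetical names:

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not the actual Tockit API.
final class TeridentitySketch {
    // Returns the ternary relation {(x, x, x) | x in domain}.
    static Set<List<Object>> teridentity(Set<?> domain) {
        Set<List<Object>> result = new LinkedHashSet<>();
        for (Object x : domain) {
            result.add(Arrays.asList(x, x, x));
        }
        return result;
    }
}
```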
Post by J. Hereth Correia
So far, so Peirce ;-)
And finally, I'd like to have transitive closure for binary relations.
Forgot about this one. Any other closures?
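
A minimal sketch of transitive closure for a binary relation, using a naive fixpoint iteration (no attempt at efficiency, in line with the stated goals). Pairs are assumed to be two-element lists in a Set<List<Object>>; ClosureSketch and transitiveClosure are hypothetical names:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

// Hypothetical helper for illustration; not the actual Tockit API.
final class ClosureSketch {
    static Set<List<Object>> transitiveClosure(Set<List<Object>> relation) {
        Set<List<Object>> closure = new LinkedHashSet<>(relation);
        boolean changed = true;
        while (changed) {
            // Collect (a, c) for every (a, b) and (b, c) currently in the closure.
            List<List<Object>> additions = new ArrayList<>();
            for (List<Object> ab : closure) {
                for (List<Object> bc : closure) {
                    if (Objects.equals(ab.get(1), bc.get(0))) {
                        additions.add(Arrays.asList(ab.get(0), bc.get(1)));
                    }
                }
            }
            // addAll returns true only if at least one new pair was added.
            changed = closure.addAll(additions);
        }
        return closure;
    }
}
```

The loop stops once a pass adds nothing new, which must happen because only finitely many pairs can be formed from the elements of a finite relation.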
Since I don't care about efficiency for now, these things should all be
easy to do. I am trying to get a rather complete set.
Peter