Relational Python: Projection


Now that we have a basic class for relations and a method for displaying them, we can start working through the relational operators, beginning with PROJECT.

PROJECT is defined such that if rel1 is:

+-----+-------+-----+--------+
| ENO | ENAME | DNO | SALARY |
+-----+-------+-----+--------+
| E1  | Lopez | D1  | 40K    |
| E3  | Finzi | D2  | 30K    |
| E2  | Cheng | D1  | 42K    |
+-----+-------+-----+--------+

then PROJECT(rel1, ["ENO", "ENAME"]) is:

+-----+-------+
| ENO | ENAME |
+-----+-------+
| E1  | Lopez |
| E3  | Finzi |
| E2  | Cheng |
+-----+-------+

It's sometimes useful, even when not dealing with relations, to be able to do projections of dictionaries. The following function does that:

def project(orig_dict, attributes):
    return {key: value for key, value in orig_dict.items() if key in attributes}
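
As a quick illustration of the dictionary projection on its own, here is a small sketch. The employee dictionary and the "AGE" attribute are made up for the example, and project is repeated (written as a dict comprehension, which behaves the same way) so the snippet runs by itself.

```python
def project(orig_dict, attributes):
    # Keep only the items whose key is one of the requested attributes.
    return {key: value for key, value in orig_dict.items() if key in attributes}


# Example data, invented for this illustration.
employee = {"ENO": "E1", "ENAME": "Lopez", "DNO": "D1", "SALARY": "40K"}

projected = project(employee, ["ENO", "ENAME"])
print(projected)  # → {'ENO': 'E1', 'ENAME': 'Lopez'}

# An attribute that is not in the dictionary is silently skipped.
partial = project(employee, ["ENO", "AGE"])
print(partial)  # → {'ENO': 'E1'}
```

Note that the original dictionary is left untouched, and that asking for an attribute the dictionary lacks does not raise an error. Whether that leniency is desirable for relations is a design choice.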

This can then be used to define PROJECT:

def PROJECT(orig_rel, attributes):
    new_rel = Rel(attributes)
    for tup in orig_rel.tuples():
        new_rel.add(project(tup, attributes))
    return new_rel

(Note that if Rel took an iterator over tuples in its constructor, this could be simplified further. I might do that at some stage.)
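
To give an idea of what that simplification might look like, here is a rough sketch. The Rel below is a minimal stand-in written only for this example (it is an assumption, not the class from the earlier posts), and its optional tuples argument is hypothetical. It assumes tuples are stored in a set, so duplicates collapse.

```python
class Rel:
    """Minimal stand-in for Rel (an assumption made for this example)."""

    def __init__(self, attributes, tuples=()):
        self.attributes_ = tuple(attributes)
        self.tuples_ = set()  # a set, so duplicate tuples collapse
        # Hypothetical extension: accept any iterable of tuples up front.
        for tup in tuples:
            self.add(tup)

    def add(self, tup):
        # Store each dictionary as a tuple of values in attribute order.
        self.tuples_.add(tuple(tup[attr] for attr in self.attributes_))

    def tuples(self):
        for values in self.tuples_:
            yield dict(zip(self.attributes_, values))


def project(orig_dict, attributes):
    return {key: value for key, value in orig_dict.items() if key in attributes}


def PROJECT(orig_rel, attributes):
    # The generator expression is consumed by the hypothetical constructor.
    return Rel(attributes, (project(tup, attributes) for tup in orig_rel.tuples()))


rel1 = Rel(["ENO", "ENAME", "DNO", "SALARY"], [
    {"ENO": "E1", "ENAME": "Lopez", "DNO": "D1", "SALARY": "40K"},
    {"ENO": "E3", "ENAME": "Finzi", "DNO": "D2", "SALARY": "30K"},
    {"ENO": "E2", "ENAME": "Cheng", "DNO": "D1", "SALARY": "42K"},
])

rel2 = PROJECT(rel1, ["DNO"])
print(sorted(tup["DNO"] for tup in rel2.tuples()))  # → ['D1', 'D2']
```

With this set-based stand-in, projecting onto DNO yields two tuples rather than three, because the two D1 rows collapse into one.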

This PROJECT function implements the relational operator PROJECT. It makes a new relation based on a point-in-time snapshot of another. However, it's easy to make the projection dynamic as well.

The following class creates a dynamic projection of a relation: it always reflects the current state of the original relation, rather than its state at the moment the projection was made.

class PROJECT_VIEW(Rel):

    def __init__(self, orig_rel, attributes):
        Rel.__init__(self, attributes)
        self.orig_rel = orig_rel

    def add(self, tup):
        raise Exception("PROJECT_VIEW is read-only; add tuples to the original relation")

    def tuples(self):
        for tup in self.orig_rel.tuples():
            yield project(tup, self.attributes_)

rel3 = PROJECT_VIEW(rel1, ["ENO", "ENAME"]) works just like rel2 = PROJECT(rel1, ["ENO", "ENAME"]), except that if new tuples are later added to rel1, rel3 reflects them whereas rel2 stays the same.
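
The difference between the snapshot and the view can be seen in a short, self-contained sketch. The Rel class here is again a minimal stand-in assumed for the example (set-based, with an attributes_ field), not the class from the earlier posts. The other definitions follow the ones above.

```python
class Rel:
    """Minimal stand-in for Rel (an assumption made for this example)."""

    def __init__(self, attributes):
        self.attributes_ = tuple(attributes)
        self.tuples_ = set()

    def add(self, tup):
        self.tuples_.add(tuple(tup[attr] for attr in self.attributes_))

    def tuples(self):
        for values in self.tuples_:
            yield dict(zip(self.attributes_, values))


def project(orig_dict, attributes):
    return {key: value for key, value in orig_dict.items() if key in attributes}


def PROJECT(orig_rel, attributes):
    new_rel = Rel(attributes)
    for tup in orig_rel.tuples():
        new_rel.add(project(tup, attributes))
    return new_rel


class PROJECT_VIEW(Rel):

    def __init__(self, orig_rel, attributes):
        Rel.__init__(self, attributes)
        self.orig_rel = orig_rel

    def add(self, tup):
        raise Exception("PROJECT_VIEW is read-only")

    def tuples(self):
        # Re-reads the original relation every time it is iterated.
        for tup in self.orig_rel.tuples():
            yield project(tup, self.attributes_)


rel1 = Rel(["ENO", "ENAME", "DNO", "SALARY"])
rel1.add({"ENO": "E1", "ENAME": "Lopez", "DNO": "D1", "SALARY": "40K"})
rel1.add({"ENO": "E3", "ENAME": "Finzi", "DNO": "D2", "SALARY": "30K"})

rel2 = PROJECT(rel1, ["ENO", "ENAME"])       # snapshot
rel3 = PROJECT_VIEW(rel1, ["ENO", "ENAME"])  # live view

# Add a tuple to the original relation after both projections exist.
rel1.add({"ENO": "E2", "ENAME": "Cheng", "DNO": "D1", "SALARY": "42K"})

print(len(list(rel2.tuples())))  # → 2
print(len(list(rel3.tuples())))  # → 3
```

One thing worth noticing: the view yields one projected dictionary per original tuple, so projecting onto a non-key attribute such as DNO could yield duplicates, whereas a set-based Rel like this stand-in would collapse them in the snapshot.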

As always, I welcome people's suggestions as to how to improve this.
