So I’ve been refreshing my Java skills, working through Deitel and Deitel’s “Java Standard Edition 8” training material. The first seven chapters have been pretty easy going, but I’ve been doing the usual – blowing out the simple coding examples so that they actually model the real world.
For example, when simulating shuffling a deck of cards, the sample code simply takes the entire deck from top to bottom, and swaps the next card with a random one below it. Of course this violates the way that a real shuffle works. In a real shuffle, the cards at the top of the two stacks of the cut end up closer to the top. So I wrote a random shuffle algorithm that simulates the cut, and merges the two by taking cards randomly from each stack until one is exhausted.
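A minimal sketch of that kind of cut-and-merge shuffle follows. It is not the code from the project; the class name `RiffleShuffle`, the method `riffle`, and the choice to cut within roughly 10% of the middle are all assumptions made for illustration. Index 0 is treated as the top of the deck.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class RiffleShuffle {
    private RiffleShuffle() {}

    /** Returns a new list: the deck cut near the middle, then merged by random draws. */
    static <T> List<T> riffle(List<T> deck, Random rng) {
        int n = deck.size();
        if (n < 2) {
            return new ArrayList<>(deck);
        }
        // Cut somewhere near the middle (the +/- 10% range is an assumption).
        int slack = Math.max(1, n / 10);
        int cut = n / 2 - slack + rng.nextInt(2 * slack + 1);
        cut = Math.max(1, Math.min(n - 1, cut));

        List<T> left = deck.subList(0, cut);
        List<T> right = deck.subList(cut, n);
        List<T> merged = new ArrayList<>(n);
        int i = 0;
        int j = 0;
        // Take the next card from one stack or the other at random
        // until one stack is exhausted.
        while (i < left.size() && j < right.size()) {
            if (rng.nextBoolean()) {
                merged.add(left.get(i++));
            } else {
                merged.add(right.get(j++));
            }
        }
        // Whatever remains of the other stack lands at the bottom, in order.
        merged.addAll(left.subList(i, left.size()));
        merged.addAll(right.subList(j, right.size()));
        return merged;
    }
}
```

Because each stack keeps its internal order, one pass leaves the deck far from random, which is why a real deck gets riffled several times; calling `riffle` repeatedly on its own result models that.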
The next assignment is to capture some statistics on a set of test scores. It’s a pretty simple problem: minimum and maximum values and the average. But you know where that goes: at the end of the term, the scores for all the assignments have to be rolled up into some final grade. This seemed like an interesting problem: coming up with some general mechanism for aggregating scores into a final grade.
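For reference, the simple version of the problem fits in a few lines of Java 8. This sketch uses the standard `IntSummaryStatistics` class rather than the textbook's own loop; the class name `ScoreStats` and the sample scores are made up for illustration.

```java
import java.util.IntSummaryStatistics;
import java.util.stream.IntStream;

final class ScoreStats {
    private ScoreStats() {}

    /** Collects count, min, max, sum and average in a single pass. */
    static IntSummaryStatistics summarize(int[] scores) {
        return IntStream.of(scores).summaryStatistics();
    }

    public static void main(String[] args) {
        int[] scores = {87, 68, 94, 100, 83}; // made-up sample data
        IntSummaryStatistics stats = summarize(scores);
        System.out.println("min = " + stats.getMin());
        System.out.println("max = " + stats.getMax());
        System.out.println("average = " + stats.getAverage());
    }
}
```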
We all know how terms start: the teacher hands out a syllabus with a weighting for each element of the course work: homework, quizzes, mid-terms, papers and finals are typical elements. Each element is given an expected weighting to the final grade.
Of course, it never works out that way. Some midterms are harder than others, but each should contribute the same weight to the final grade. This is sometimes accomplished by weighting the test scores so that the averages are the same. And what if the students move through the material faster or slower than in prior years? Might they not complete more or fewer assignments than expected?
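One way to sketch both adjustments, under stated assumptions: rescale each exam so the class mean hits a common target, and roll up the final score from per-category averages so that the number of assignments in a category does not change its weight. The names `GradeRollup`, `scaleToMean` and `finalScore` are hypothetical, and real grading policies may well differ.

```java
import java.util.Arrays;
import java.util.Map;

final class GradeRollup {
    private GradeRollup() {}

    /** Rescales one exam's scores so that the class mean equals targetMean. */
    static double[] scaleToMean(double[] scores, double targetMean) {
        double mean = Arrays.stream(scores).average().orElse(0.0);
        if (mean == 0.0) {
            return scores.clone(); // nothing sensible to scale by
        }
        double factor = targetMean / mean;
        return Arrays.stream(scores).map(s -> s * factor).toArray();
    }

    /**
     * Weighted average of per-category means. Categories with no scores
     * (or no weight) are skipped and the remaining weights renormalized.
     */
    static double finalScore(Map<String, double[]> scoresByCategory,
                             Map<String, Double> weights) {
        double total = 0.0;
        double weightSum = 0.0;
        for (Map.Entry<String, double[]> entry : scoresByCategory.entrySet()) {
            double[] scores = entry.getValue();
            Double weight = weights.get(entry.getKey());
            if (weight == null || scores.length == 0) {
                continue;
            }
            total += weight * Arrays.stream(scores).average().getAsDouble();
            weightSum += weight;
        }
        return weightSum == 0.0 ? 0.0 : total / weightSum;
    }
}
```

Averaging within a category before weighting means that three homework sets or thirteen contribute the same share of the final grade, which is one answer to the "more or fewer assignments" question.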
So this simple little fifty-line program became a ten-module monster. I can’t entirely blame my son Gregory for the damage done by my interview with him on grading policies at the JC he’s attending. But he did bring up a really interesting point: nobody but the professor knows the actual assignment scores. She produces a final letter grade, and that’s all that the records office knows.
We were trying to decide how to model this, and came up with the idea of the professor having a grade book with a private set of identifiers that link back to the student records held by the registrar. After each assignment is graded, the instructor looks up the grade book ID for the student, and adds the grade to the book against that ID. At the end of the term, the professor combines the scores to produce a class curve, and assigns a letter grade for each interval in the distribution. In the end, then, no student knows how close they were to making the cut on the next letter grade, so nobody knows whether or not they have a right to appeal the final grade.
In my code model, therefore, I have two kinds of people: students and instructors. Now we normally identify people by their names – every time you fill out a form, that information goes on it. But sometimes names change.
In the grade book, of course, we also want identities to remain anonymous. We need mechanisms to make sure that IDs are difficult to trace back to the person being described. The NSA did this with records subpoenaed from the phone carriers – though nobody was convinced that the NSA wasn’t bypassing the restrictions that were supposed to prevent names from being linked to the phone calls until a warrant was obtained from a court. In the case of my simple gradebook model, it’s accomplished by making the class roster private to the “Instructor” class.
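The arrangement might be sketched like this. Everything here is an illustrative assumption rather than the actual project code: the use of `UUID` for the private identifiers, the method names, and especially the fixed 90/80/70/60 cutoffs, which stand in for the class curve described above.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

/** Holds scores against opaque IDs only; it never sees a student record. */
final class GradeBook {
    private final Map<UUID, List<Double>> scores = new HashMap<>();

    void record(UUID id, double score) {
        scores.computeIfAbsent(id, k -> new ArrayList<>()).add(score);
    }

    double average(UUID id) {
        List<Double> recorded = scores.getOrDefault(id, new ArrayList<>());
        return recorded.stream().mapToDouble(Double::doubleValue).average().orElse(0.0);
    }
}

final class Instructor {
    // The roster is the only link from a registrar's student number
    // to a grade book ID, and it is private to this class.
    private final Map<String, UUID> roster = new HashMap<>();
    private final GradeBook book = new GradeBook();

    void enroll(String registrarNumber) {
        roster.putIfAbsent(registrarNumber, UUID.randomUUID());
    }

    void grade(String registrarNumber, double score) {
        book.record(lookup(registrarNumber), score);
    }

    /** Only the letter leaves the class. Fixed cutoffs are a stand-in for a curve. */
    char letterGrade(String registrarNumber) {
        double avg = book.average(lookup(registrarNumber));
        if (avg >= 90) return 'A';
        if (avg >= 80) return 'B';
        if (avg >= 70) return 'C';
        if (avg >= 60) return 'D';
        return 'F';
    }

    private UUID lookup(String registrarNumber) {
        UUID id = roster.get(registrarNumber);
        if (id == null) {
            throw new IllegalArgumentException("not enrolled: " + registrarNumber);
        }
        return id;
    }
}
```

Nothing outside `Instructor` can turn a grade book ID back into a student, and nothing outside it can read an individual score, which is exactly the property that leaves the student unable to tell how close they came to the next letter.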
This all got me to thinking about how slippery “identity” is as a concept. It can be anything from the random number chosen by the instructor to a birth certificate identifier to a social security number to a residence. All of these things provide some definite information about a person, information that can be used to build a picture of their life. Some of it is persistent: the birth certificate number. Other identities may change: the social convention is that a woman changes her name when she marries. And in today’s mobile world, we all change residences frequently. A surprising change in my lifetime has been that my phone number doesn’t change when I change residence, and the phone number is a private number, where once it was shared with seven people.
So as I was modelling the grade book, I found myself creating an “Instructor” class and a “Student” class, and adding a surname and given name to both. I hate it when this happens, and in the past I would have created a “Person” that would capture that information, and made “Student” and “Instructor” sub-classes of Person. But that always fails, of course: what happens when an instructor wants to sign up for an adult education class?
And so I hit upon this: what if we thought of all of these pieces of identifying information as various forms of an “Identity”? Then the instructor and student records each link to the identity, which could be a “Personal Name.” That association of “Personal Name” with “Instructor” or “Student” reflects a role for the person represented by the identity. That role may be temporary, which means that we need to keep a start and end date for each role. And the role itself may be identifying information: a student ID is certainly valid to get discount passes at the theater, for example.
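A rough sketch of that idea, using composition instead of a “Person” superclass. The type names `Identity`, `PersonalName`, `RoleType` and `Role` are assumptions chosen to mirror the prose, the end date is treated as inclusive, and a `null` end date is taken to mean the role is still active.

```java
import java.time.LocalDate;

/** Any piece of identifying information: a name, a phone number, an address. */
interface Identity {
    String describe();
}

final class PersonalName implements Identity {
    private final String givenName;
    private final String surname;

    PersonalName(String givenName, String surname) {
        this.givenName = givenName;
        this.surname = surname;
    }

    @Override
    public String describe() {
        return givenName + " " + surname;
    }
}

enum RoleType { STUDENT, INSTRUCTOR }

/** Links an identity to a role for a bounded period of time. */
final class Role {
    private final Identity identity;
    private final RoleType type;
    private final LocalDate start;
    private final LocalDate end; // inclusive; null while the role is ongoing

    Role(Identity identity, RoleType type, LocalDate start, LocalDate end) {
        if (end != null && end.isBefore(start)) {
            throw new IllegalArgumentException("end date precedes start date");
        }
        this.identity = identity;
        this.type = type;
        this.start = start;
        this.end = end;
    }

    Identity identity() { return identity; }

    RoleType type() { return type; }

    boolean activeOn(LocalDate day) {
        return !day.isBefore(start) && (end == null || !day.isAfter(end));
    }
}
```

With this shape, the instructor who signs up for an adult education class is simply one `PersonalName` referenced by two `Role` objects, an open-ended `INSTRUCTOR` role and a `STUDENT` role with a term's start and end dates, and no class hierarchy has to bend to accommodate it.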
The subtlety is that addresses and old phone numbers are reassigned to other people every now and then. The latter was a frequent hassle for people who got the phone number last held by a defunct pizza take-out. And it’s even worse for the family living right in the middle of America, which is the default address for every internet server that can’t be traced to a definite location. The unfortunate household gets all kinds of writs for fraud committed by anonymous computer hackers.
But I really wish that I had had a tool that allowed me to maintain a database with all of this information in it. I don’t think that I can reconstruct my personal history at this point. As it is, what I have in my personal records is my current identity: my credit card numbers (which BofA fraud detection keeps on replacing), my current address and phone number, my current place of employment. That is all that the computer knows how to keep.
With the upshot that I know far less about myself than the credit agencies do.