Tag Archives: Tech Problems

On doing things right the first time

If you haven’t got the time to do it right, when will you find the time to do it over?

– Jeffrey J. Mayer

If there is one lesson I have learned at my current job, it is that you should always, always, take the time to do things right the first time, especially if everything else you do will be built upon it.

For example, about a year ago, I was tasked with migrating an aging Access database to MySQL. The database was horrible. There were just about no checks in place to prevent inconsistent records (changing an ID in one place would not update other references to that ID in the rest of the database), and it was full of very inefficient data types and needlessly complicated ways to store values. For example, instead of using a boolean value to indicate whether a course was offered at a specific university, the column would contain either NULL (or the Access equivalent) or the name of the university.

Now, someone had already imported the database layout into MySQL, so I took a look at it. I wrote down several ideas on how to improve the database, including the use of foreign keys to preserve consistency, boolean columns to store true/false values, IDs instead of names as primary keys, and so on. Once that was finished, I talked the proposals through with my employer and we discussed the changes. Most were accepted; some were scrapped, including the foreign keys, for reasons I cannot remember (but which seemed somewhat logical at the time).
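To make those proposals a little more concrete, here is a minimal sketch of what they can look like. It uses Python's built-in sqlite3 module instead of MySQL so that it runs anywhere without a server, and the table and column names (university, course, offered) are invented for illustration; they are not the real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default

# Hypothetical schema: numeric IDs as primary keys, a foreign key from
# course to university, and a 0/1 flag instead of "NULL or a name".
conn.executescript("""
CREATE TABLE university (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE
);
CREATE TABLE course (
    id            INTEGER PRIMARY KEY,
    title         TEXT NOT NULL,
    university_id INTEGER NOT NULL
        REFERENCES university(id) ON UPDATE CASCADE ON DELETE RESTRICT,
    offered       INTEGER NOT NULL DEFAULT 0 CHECK (offered IN (0, 1))
);
""")

conn.execute("INSERT INTO university (id, name) VALUES (1, 'Example University')")
conn.execute(
    "INSERT INTO course (title, university_id, offered) VALUES ('Databases 101', 1, 1)"
)

# Changing an ID in one place now updates every reference to it.
conn.execute("UPDATE university SET id = 42 WHERE id = 1")
moved_to = conn.execute("SELECT university_id FROM course").fetchone()[0]

# A course pointing at a university that does not exist is rejected.
try:
    conn.execute("INSERT INTO course (title, university_id) VALUES ('Orphan', 999)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(moved_to, orphan_rejected)
```

The same ideas carry over to MySQL, with the caveat that foreign keys are only enforced on InnoDB tables and that BOOLEAN there is an alias for TINYINT(1).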

The database has been in use for a while now, and the time has come to add some functionality (triggers to automate boring tasks and new ways to display the data, for example). So, for the first time in about six months, I took another look at the database and at the data it contained.

My initial reaction:

Through some bad decisions on my part, combined with bad decisions when creating the frontend and the fact that the database was accessed through three different interfaces, only one of which had any form of sanity checks, the database had deteriorated into a state that made it nigh impossible to do anything without abusing MySQL in at least three different ways that would have made my databases instructor weep.

The problem? The database is in use. I cannot change it, at least not within a reasonable timeframe and not without rebuilding most of the frontend. The frontend was created with a piece of software called “PHP Generator for MySQL”, which has some nice features but also frequently throws a hissy fit if you do not use it in exactly the way it thinks it should be used, not to mention that it generates code vulnerable to the most basic XSS attacks (a problem I reported more than nine months ago, by the way). Rebuilding it would be an effort of at least 20 hours, which could have been avoided by one or two hours of work improving the database structure at the beginning of the project. And I don’t have those 20 hours.

My contract with my employer will end soon, and once my supervisor and I are gone, they will have no one left who can even understand MySQL. I pity the person who will inherit this database and try to make sense of it. I am trying to document things as well as I can, but there is only so much you can do for a database that should either never have existed or have been put out of its misery a long time ago.

Seriously. Do things right the first time, or you will regret it in the long run.

Those Wonderful Evenings or: What can go wrong, will

It is the general consensus that IT problems are pack animals. This evening proves it once again.

It all began in a completely harmless fashion: for a week or so, I had been having problems with my work email account. The organization I work for hosts its email infrastructure at Microsoft, on Office365 to be precise. This enables us to use the full Exchange functionality at no cost (since it is an educational institution, Office365 is free for us). Personally, I could not care less about that, since I use Linux exclusively, but hey, as long as I can get my mail, whatever.

So, about a week ago, Thunderbird started reporting errors when connecting to my email account. It babbled something about the login to the server failing, offering to retry, change my password, or cancel. I hit Retry a few times, and eventually the message stopped. I assumed Thunderbird had finally managed to connect to the server and happily kept doing whatever I was doing. I never wondered why I wasn’t getting any mail, since it was very rare for me to receive any mail on that account anyway. This was mistake number one: assuming that Thunderbird would not stop trying to connect without explicitly telling me about it.

This impression was reinforced by the fact that the daily backups of my mail accounts were running just fine: my cron job reported no errors. And, at the beginning, my Android phone running K9 Mail kept receiving the messages I was sent. After that, the phone did not report any new mail, but no errors either, so I assumed that everything was fine. Even if Thunderbird was silently giving up on its connection attempts, and even if my daily backup was hitting unreported errors, at least the phone would surely complain if it was unable to connect to the server. Triple redundancy in error reporting, what could possibly go wrong?

Famous last words.

So, today, Thunderbird’s error messages finally pissed me off enough to investigate. Our email service had recently been migrated (by Microsoft) from Live@Edu to Office365. The documentation for the upgrade claimed that no changes would have to be made, so I left my settings the way they were, and they kept working happily through the upgrade, even letting me send a notice to the organization about the completed migration and fix a few problems that occurred afterwards. That was all before my problems set in.

So, as I said, I started to investigate. I tried to find out which servers I was supposed to use, and updated my Thunderbird config. The problems were still there. Curious, I logged into webmail to check if my account was still active and my password still worked. It was and did. After the login, I was greeted by “9 new messages”, the oldest going back to last Monday.

I will not bore you with my struggles to get Thunderbird working. I triple-checked the password and server settings, changed my password, and waited 30 minutes. Nothing worked.

Curious how my Android phone had kept working through all of this (or had it? I had never seen those messages, after all), I started up K9 Mail and tried to refresh my account. It went through without an error message, but also without downloading the new messages. I updated the server information and suddenly got an error message claiming a wrong password. Great. Even after deleting and re-creating my K9 Mail configuration for the account, I could not get it to work. K9 Mail had not been able to connect to the server for a whole week, but had not seen fit to inform me of that. Awesome.

Now, I was really interested in how my backups had kept working through all of this. I manually ran my backup software (I was using OfflineIMAP), only to see that the program was throwing an exception when trying to connect to my account. The exit status (“echo $?”) was still zero though, indicating success. Frustrated, I went to its GitHub page, intending to write a bug report, when I realized that I was running a horribly outdated version that I had installed from the Raspbian repositories (Raspbian being Debian for the Raspberry Pi). I removed the old version, installed the current one, and retried the run, only to be met with an error about cert_fingerprints not being set. The program still exited with 0, by the way. Someone upgrading the program automatically, using apt-get for example, would never have seen this change and, thanks to the exit status signalling success, would never have been notified that their backups were failing. I wrote up a bug report, fixed the config file, and tried again. Now I was getting the “LOGIN FAILED” I expected, but the program STILL exited with a status of zero. I sighed heavily (actually, I cursed loudly), updated my bug report, and mailed Microsoft Support about the login problem.
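One way to guard a cron job against a tool that reports success while failing is to wrap it and look at its output as well as its exit status. The following is only a rough sketch: the function name run_checked and the marker list are assumptions made for illustration, and the markers would have to be tuned to whatever the tool in question actually prints.

```python
import subprocess
import sys

# Assumed failure markers; adjust them to what your tool really prints.
ERROR_MARKERS = ("Traceback", "ERROR", "LOGIN FAILED")


def run_checked(argv):
    """Run a command and return (ok, output).

    ok is False if the exit status is non-zero OR if the combined
    stdout/stderr output contains one of the known failure markers.
    """
    result = subprocess.run(
        argv, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
    )
    output = result.stdout
    marker_found = any(marker in output for marker in ERROR_MARKERS)
    return (result.returncode == 0 and not marker_found), output


# Stand-in for a backup tool that prints an error but still exits with 0.
lying_tool = [sys.executable, "-c", "print('LOGIN FAILED')"]
ok, output = run_checked(lying_tool)
print("backup ok" if ok else "backup FAILED:\n" + output)
```

In a real cron job the stand-in command would be replaced by the actual backup command, and the wrapper would exit with a non-zero status when ok is False so that whatever watches the job has something to react to. Matching on strings is crude and can produce false alarms, but a false alarm is cheaper than a week of silently missing backups.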

It has been two hours now, and I have found:

  1. One case of bad coding in Thunderbird (not reporting that it had stopped its connection attempts)
  2. One case of missing error messages in K9 Mail
  3. Two cases of a potentially fatal wrong exit status in OfflineIMAP
  4. One case of WTF about Microsoft (Seriously, why doesn’t this crap work?)
  5. One case of foul mood and desire to punch cute kittens

Lessons Learned:

  1. Don’t rely on error messages being there if you have never seen them.
  2. Don’t rely on the exit status of software you have not written and / or tested yourself.
  3. Don’t assume that having three different ways of being notified means you actually will be notified when something goes wrong, unless you have tested at least one of them (basically 1 and 2 combined).
  4. Even (or: especially) a billion-dollar company like Microsoft can and will screw up, and they will probably not fix it if you do not complain.
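Lesson 3 can be turned into a small fire drill: feed a health check some commands that are broken on purpose and see whether it notices. The sketch below is hypothetical throughout (exit_status_only and fire_drill are names invented here); it tests a naive check that trusts nothing but the exit status.

```python
import subprocess
import sys


def exit_status_only(argv):
    """A naive health check that trusts the exit status and nothing else."""
    return subprocess.run(argv, capture_output=True).returncode == 0


def fire_drill(check):
    """Run `check` against two deliberately broken commands.

    Returns a dict mapping each kind of failure to True if the check
    caught it (reported "not ok") and False if it slipped through.
    """
    crashing = [sys.executable, "-c", "raise SystemExit(3)"]
    lying = [
        sys.executable,
        "-c",
        "import sys; print('LOGIN FAILED', file=sys.stderr)",
    ]
    return {
        "non-zero exit": not check(crashing),
        "error message with exit 0": not check(lying),
    }


drill = fire_drill(exit_status_only)
for failure, caught in drill.items():
    print(f"{failure}: {'caught' if caught else 'MISSED'}")
```

The naive check catches the crashing command but misses the one that prints an error and exits with 0, which is exactly the kind of failure described above. Any other check with the same signature can be passed to fire_drill to see whether it does better.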