Sunday, November 8, 2009

Houston, we have a problem

Engineering was about to execute a database migration script, which we had tested several times beforehand in a separate environment. Then the weekend came when we shut down the production server to do our jobs.
My job was to make sure that everything worked fine after the migration. There was only one catch: that weekend, my VPN connection failed. The connection was essential for me, because without it I could not test whether the migration had been successful. When I raised a support call, I got a message saying that there is no support on weekends for internal IT incidents...
...Fortunately I could test without the VPN, but I could not take part in the decisions and coordination of the process because I could no longer reach my business email. Although I informed all involved parties about the issue and asked them to send all related information to my private email instead, I received only half of the messages. It was a small challenge to gather all the required information, but I still managed to do the job. Still, I asked myself what those guys would do if someone decided to launch a DDoS attack on a weekend.
But I have to be grateful. What surprised me most was that first thing on Monday, a nice guy from IT came to my desk and showed me how I could access our business email from home without the VPN....

Tuesday, November 3, 2009

Declining Confidence

Wishful thinking.

Wednesday, October 21, 2009

Farewell

Dear P., you are no longer here, and I simply cannot grasp it. You leave your family behind, and they cannot understand it either. Who could? Although you and your family in particular knew that life is a gift and not to be taken for granted, your farewell still caught me completely off guard.
Sadly, it is only in moments like these that one realizes what one had and how much one valued the friendship. And now one cannot even say thank you for it. How can the good Lord call someone back so early? It is not right.

This year your family and mine shared so many experiences that one really has to be grateful, grateful for every minute and grateful for every moment our families were able to spend together. Grateful, too, for the things we got annoyed about together and also made fun of.

Your voice and your images are as close as before. I can hear you clearly; your unique greeting will always stay in my memory. I wish, no, I demand, that we will see each other again one day, even if in another world. It simply has to be so; nothing else makes sense.

T.

Monday, October 12, 2009

Imagine there is Pandemia

If you read my blog entry of August 2, titled "What does Swine Flu have in common with Testing", you may remember the number of deaths the government was expecting...

...until today, there was not a single one.

Someone else's Bug

Most of the really "cool" bugs that I find don't show up during automated testing. They don't show up during manual regression testing either. Of course, both approaches are successful within their well-known limits. But the in-depth analysis of an already reported, hard-to-reproduce bug often "generates" new anomalies that I didn't expect to find in that place.

To give an example, my goal was to find the root cause of a particular error that more and more customers ran into. It became a high priority when suddenly 100 customers were affected by it.

When analyzing real customer defects, my mind works differently than during normal testing activities. I think differently now and ask questions like "Which scenarios are candidates for driving the application into the reported behavior?"

While the functional testing techniques worked well without a customer in mind, the new approach brings the customer to the foreground. I now wear a different hat. In order to understand what happened, I need to know how a customer uses the system in an end-to-end environment.

The first step is usually to check the logs that the help desk provided and hope to find the cause quickly. The help desk usually provides a log of only one error at one specific point in time. Sometimes you need more information: how did the user get the object he was unsuccessfully working on, who originally sent it to him, who accepted it and when, who forwarded it at what time, who added data to it, etc.

The log from support gives you the necessary base information. Now I dig deeper to find out more about the object's "life cycle". Besides the client application log and the web server log files, we have something like an Event-Log attached to each object.

The Event-Log records which action was executed by which user and in which state the object ended up after the operation. That sounds great, but I had always had the strange feeling that something was wrong with this log. There were too many inconsistencies, and I feared I would lose time by digging into this other area.

This time I had no choice: I had to understand each action and the state it ended in, so that I could reproduce the same scenario in our test environment. And once again I couldn't follow the weird order of the log entries, and the timestamps didn't really make sense to me either. It was now time to get into this new area of potential defects. I created an object from scratch, then immediately checked the event log and noted down which actions and states it had recorded. I appended data to the object, sent it to someone else, pushed it through several web services, and took a screenshot of the event log after each action. Step by step I tried out variations with the sole goal of understanding the patterns in the logs.
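
A comparison like this, between what was actually done and what the event log claims, could in principle be partly automated. The following is only a minimal sketch in Python, not the tooling used here: the entry fields (timestamp, user, action, resulting_state), the function names, and the idea of a hand-written table of expected transitions are all assumptions made for illustration, since the real Event-Log layout is not described in this post.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class EventLogEntry:
    # Hypothetical fields; the real Event-Log layout is an assumption.
    timestamp: datetime
    user: str
    action: str
    resulting_state: str


def find_timestamp_anomalies(entries):
    """Return (index, previous, current) for each entry older than its predecessor."""
    anomalies = []
    for i in range(1, len(entries)):
        prev, cur = entries[i - 1], entries[i]
        if cur.timestamp < prev.timestamp:
            anomalies.append((i, prev, cur))
    return anomalies


def find_unexpected_transitions(entries, expected):
    """Return indices of entries whose resulting state contradicts `expected`.

    `expected` maps (state_before, action) -> state_after, e.g. as noted
    down by hand while stepping through a scenario in the test environment.
    A missing key also counts as unexpected.
    """
    anomalies = []
    for i in range(1, len(entries)):
        key = (entries[i - 1].resulting_state, entries[i].action)
        if expected.get(key) != entries[i].resulting_state:
            anomalies.append(i)
    return anomalies
```

Feeding such functions the entries in the order the log displays them would flag exactly the two symptoms described above: entries whose timestamps run backwards, and states that do not match the action just performed.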

What I found was astonishing. My original assumptions were confirmed: the entries were incorrect. To put it briefly, they didn't tell the truth. When I asked our support people whether they had ever noticed the clutter in the event log's sequence and timestamps, they confirmed that they didn't like reading these logs for the same reason I had avoided them: they just could not follow the entries. I then talked to the architect, who, to my surprise, confirmed that there were some known issues with the event log. Strengthened by this experience, I raised a bunch of new defects related to the Event-Log. This was just one example. The truth is that I usually run into more such "side" effects.

And so happened what often happens while I am testing: I drifted away. I had set out to find the root cause of a completely different defect and found a bunch of other interesting anomalies that had nothing to do with the original defect.

BTW, what was the problem I originally set out to investigate?




Thursday, September 3, 2009

Configuration Issue 2

Tuesday, September 1, 2009

Password Expired

If you ask me where I got the inspiration for this cartoon, I must admit that it was quite a long detour to this scene. Actually, it can't be traced back to one particular incident. How about the diplomatic affair between Libya and the Swiss government, where our Federal President is just about to learn some basic lessons?

Or what about that nice instruction from IT that strictly forbids us to write any passwords on paper, and then hands out PIN codes to users on paper?

Another example is the story an employee told me after he saw the cartoon. He said he had access to a special server room which is important in case of emergency. When he came back from holiday, his access had been disabled. He spent an hour investigating why he could no longer access the room, until he found out that IT had disabled all accounts whose owners they could no longer identify. According to him, it wasn't the first time. Luckily, so far we haven't had any emergency where that access was needed immediately.