Our Learnings from Five Months without Production Feedback
by Finn Lorbeer
With the start of the coronavirus pandemic in March 2020, the economy learned the hard way that any business can come to an abrupt stop. With the lockdown, our customers stayed at home, and so did our vehicles. In those days, we painfully realised how much the absence of customers affects us, and in how many different ways.
Every company and every product goes through different phases. New ideas, concepts and inspirations are discovered. Then people define what they want to build. A given solution is developed (most often some coding is involved here), and finally the product is delivered to customers: the product is “live”. That state of “being live” usually does not change until the product is no longer used and is sunset many, many (many!) years later.
But in this case, something else happened: the pause button was hit for all our production systems. This is a very uncomfortable position for a customer-centric, data-driven, DevOps-culture company like MOIA. In this blog post, we want to look into these three aspects.
What happens to a customer-centric company without customers?
With no vehicles on the streets and no passengers being transported, we received no more feedback from customers and drivers, neither directly nor via social media.
In normal operations, we have a team looking into this feedback. We take these learnings very seriously and usually derive actions from them.
When an entire company and many teams direct their development efforts (also) towards this feedback and are curious to learn from it, it is quite frustrating when it is suddenly missing entirely. But with the start of our night service in Hamburg, literally the entire company was happy to receive feedback again from the drivers and passengers in our Plutos, our vehicles. Needless to say, we are looking forward to receiving more once our service is fully running again.
We were not idle during the months of lockdown: we kept working on new features and improvements for which we have strong and historically consistent user feedback. Furthermore, we started remote user testing to validate the most important basic assumptions behind our features. Now we are curious to learn how they are really being used. Hopefully, they are mostly used in the very way we intended :-)
Being Data-Driven without Data
Direct feedback from customers and drivers is great. But we also want to learn about the bigger picture: How many people tried to ride with us but did not get a vehicle? Are we having fewer delays this month compared to last month? How long is the average travel time? And how often are we late because we waited (too long) for a single person? What impact do the weather, the day of the week and the time of day have on demand? And also: How do changes to our algorithms influence all of this?
Those are just some of the questions we are asking ourselves every day. This is why we have a couple of teams who ensure we are collecting, processing and analysing the data from our apps.
Still, the different modes of lockdown, lockdown light, curfew and waves of the pandemic, combined with full service, night service and no service, make it even more difficult to establish a good data baseline for answering these questions properly. Hence, we have to work with approximations and compare similar situations with each other instead of just looking at larger historic data sets.
A good mitigation strategy here was to align across the entire company on which features were built based on “strong” data and which features were rather hypothesis-driven. We held a couple of alignment sessions across all the teams to derive a priority for testing in production. Since some teams need a bit more data to support a hypothesis and some features require a bit more analysis, we spent significant effort aligning across the teams on where we need more data, testing or validation. We now understand what we need to keep an eye on (because we are a bit unsure about the impact). And we know where we can rather safely deliver new functionality, since there are no (or just very few) uncertainties around how it will be used. We are now ready to deliver a lot of our new features in order to find out where we can still improve and where we delivered (more or less) directly to customer expectations.
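To make the idea of comparing similar situations a bit more tangible, here is a minimal sketch in Python. It is not our actual analytics pipeline: the record layout, the field names and all numbers are made up for illustration. The only point is that a change in average delay is computed exclusively for situations (service mode plus type of day) that occur in both periods.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical ride records: (period, service_mode, day_type, delay_minutes).
# All values are invented for illustration; they are not real data.
RIDES = [
    ("before", "full", "weekday", 4.0),
    ("before", "full", "weekday", 6.0),
    ("before", "night", "weekend", 3.0),
    ("after", "night", "weekend", 2.0),
    ("after", "night", "weekend", 3.0),
    ("after", "none", "weekday", 0.0),
]


def mean_delay_by_situation(rides, period):
    """Average delay per (service_mode, day_type) within one period."""
    buckets = defaultdict(list)
    for ride_period, mode, day_type, delay in rides:
        if ride_period == period:
            buckets[(mode, day_type)].append(delay)
    return {situation: mean(delays) for situation, delays in buckets.items()}


def compare_like_with_like(rides, baseline="before", current="after"):
    """Change in mean delay, only for situations present in both periods."""
    base = mean_delay_by_situation(rides, baseline)
    curr = mean_delay_by_situation(rides, current)
    shared = base.keys() & curr.keys()
    return {situation: curr[situation] - base[situation] for situation in shared}


if __name__ == "__main__":
    print(compare_like_with_like(RIDES))
```

Situations that exist in only one period, such as full weekday service before the lockdown, are deliberately left out instead of being compared against something that is not comparable.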
DevOps without Monitoring
MOIA runs its services with a “you build it, you run it” approach in the teams. Since in pre- (and post-) pandemic times we operate day and night, we need to be able to react to production issues at any time, including at night. As a consequence, teams are thinking about which alerts will wake them up at night and…
- thinking in detail about their alerting and the severity of each alert, as well as the documentation for the first responder
- challenging themselves on code that goes to production, in a positive way. Apart from the true drive of each individual team member to deliver the best possible product to our customers, people have a genuine interest in not being woken up at night by smallish mistakes :-)
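As a rough illustration of what thinking about alert severity can look like, here is a small Python sketch. The severity levels, the routing policy, the alert names and the runbook path are all hypothetical and are not taken from our real setup; the sketch only shows the principle that just the most severe alerts should wake somebody up, and that every alert points the first responder to some documentation.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    INFO = 1
    WARNING = 2
    CRITICAL = 3


@dataclass(frozen=True)
class Alert:
    name: str
    severity: Severity
    runbook: str  # where the first responder finds documentation (hypothetical)


def route(alert: Alert) -> str:
    """Decide how an alert reaches the team (an assumed, simplified policy)."""
    if alert.severity is Severity.CRITICAL:
        return "page"  # wakes up the on-call person, day or night
    if alert.severity is Severity.WARNING:
        return "ticket"  # picked up during working hours
    return "dashboard"  # visible, but nobody is notified


if __name__ == "__main__":
    alert = Alert("booking-errors-high", Severity.CRITICAL, "runbooks/booking.md")
    print(alert.name, "->", route(alert))
```

Keeping the policy this explicit makes it easy for a team to challenge, alert by alert, whether something really deserves to be a page.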
We also have “Observability Champions”, individual team members who are particularly interested in observability and QA in production. They drive activities in our teams around monitoring, alerting and metrics. Needless to say, the graphs below are neither joyful nor helpful:
With the night service, we gained some insights again. But since the service hours started late in the evening and ended in the morning, we could only analyse releases with a delay of 12 hours.
The way out of this situation was to produce our own data. We increased our efforts to simulate our services, and we held many more sessions in which people from across the company took virtual rides in virtual vehicles.
That last part, the bi-weekly manual testing across teams, was very interesting for me personally. I am usually a strong advocate of combining the highest degree of test automation with good monitoring. But in this new reality, I was simply lacking the monitoring part of “my” concept. This was the first time in many, many years that I was happily part of large and long cross-team testing efforts.
How to meet “new” people
Furthermore, these large testing sessions had a very nice side effect. Since the start of the pandemic and the switch to working from home, communication has become a bit trickier. We adapted our routines and reserved extra time in our calendars to compensate for the lack of watercooler socialising. But while this is comparatively easy to schedule and orchestrate on a team level, it is much harder to accomplish the same thing across the company. As a new starter, it is difficult to bump into a random person from a different team. But during the (partly virtual, partly real-world) test drives, exactly this happened: people who had not met before found themselves in one group working towards a common goal.
Speaking of bumping into a random co-worker, some people also rebuilt our real Hamburg office as a virtual space. Here, we can now meet, socialise and interact with ease. We are able to “walk” through the office, meet in smaller groups and have larger meetings. If you have ever had the chance to join us for one of our meetups, you may recognise the layout of our office. Additionally, we celebrate our monthly digital office party in this space.
In conclusion, from a product-development perspective, we couldn’t be happier to be back in service. This is for our customers, of course, but also so that we can follow our core values in delivering software again:
Being customer-centric.
Being Data Driven.
Embracing our DevOps culture.
Finally, we are fully back in service!
Acknowledgements: Thank you to Jannik Arndt, Gabi Wegner, Matthias Friedrichs, Julia Wissel, Tina Scott and Annika Kories for their insightful suggestions and comments.