Development process 101
How do things actually happen in the Tech industry?
You receive a notification about a new version of an application. You update it, and suddenly it works better or has a cool new feature. However, what happens before that notification is far more interesting. The development process of tech companies is not talked about enough in our market. If you're a programmer, you often won't know what it's like to program in a company, even after you pass three rounds of interviews and several conversations with the HR team. That's why we decided to peek behind the curtain and see what the development process looks like in a tech company. We turned to the experts at United Cloud and asked Igor Tanacković, Chief System Architect, to tell us first-hand how it works there. You can read the interview below, so let's get started...
The specificity of United Cloud products is that they are always available. For example, the EON TV platform works 24/7 every day of the year, on a large number of devices. This means that in production everything has to run like clockwork - because the platform never sleeps, just like the people who binge the series at three in the morning.
Below you will read what pre-production, production and post-production look like in United Cloud. But first, a little theory.
Trunk based development
Trunk-based development (TBD) is a branching model in which every code change is integrated into one central branch that we call the trunk (main or master). So, whatever we develop, we create a branch from the trunk and integrate the changes back into the trunk. The build, that is, the new version of the application, is always "extracted" from the trunk. Trunk-based development is a prerequisite for Continuous Integration. Continuous Integration means that integrations are done as often as possible so that we get feedback as early as possible, and Continuous Deployment means that we constantly update production. The essence, therefore, is to integrate code as often as possible and update production in smaller iterations, because smaller iterations narrow the area in which a problem in the code can hide.
The difficulty with TBD is that integrating non-functional code (bugs) into the trunk breaks the integrity of the build. Also, the more developers work on the project, the greater the risk of integrating bugs into the trunk. We address this with built-in quality control, that is, with automated testing. And now we come to the stages themselves.
Pre-production
There are several critical points in pre-production. After you write the code, and before you integrate your branch back into the trunk, you run the first test. That first test covers critical features, and it should be as short as possible. If you write the code in 20 minutes, it's pointless to wait three hours for the test. At this stage of pre-production, we try to reduce the waiting time as much as possible and test only the essentials. For example, if a video on EON TV cannot be played, it is a fundamental bug that must be fixed immediately. We call such bugs "interruption of service". But if an icon is shifted three pixels to the left, that's something we can live with and fix later.
The next test is a daily or nightly test. At this point, automated tests are run independently of developers and development. The widest range of tests is run here, because it doesn't matter whether the run lasts 2 or 7 hours. When you come to work the next day, you get extensive feedback and can start fixing bugs. In this test, we prioritize quality over time.
Only when these tests are done can we be sure that our build is good enough to go into production and won't cause any problems with the application update. This concludes the pre-production phase.
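To make the two test stages a bit more concrete, here is a minimal sketch in Python. It is not United Cloud's actual tooling: the names `Check`, `select_checks`, `run_stage` and the two sample checks are hypothetical and exist only for illustration. The idea is simply that the quick pre-merge run picks the checks marked as critical, while the nightly run takes everything.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Check:
    """One automated check (hypothetical structure, for illustration only)."""
    name: str
    critical: bool            # True if a failure would mean "interruption of service"
    run: Callable[[], bool]   # returns True when the check passes


def select_checks(checks: List[Check], stage: str) -> List[Check]:
    """Pick the checks for a stage: only critical ones before merge, all of them nightly."""
    if stage == "pre_merge":
        return [c for c in checks if c.critical]
    if stage == "nightly":
        return list(checks)
    raise ValueError(f"unknown stage: {stage}")


def run_stage(checks: List[Check], stage: str) -> List[str]:
    """Run the selected checks and return the names of the ones that failed."""
    return [c.name for c in select_checks(checks, stage) if not c.run()]


# Stand-in checks: playback works, but an icon is a few pixels off.
CHECKS = [
    Check("video_playback_starts", critical=True, run=lambda: True),
    Check("icon_alignment", critical=False, run=lambda: False),
]

print(run_stage(CHECKS, "pre_merge"))  # the cosmetic failure does not block the merge
print(run_stage(CHECKS, "nightly"))    # the nightly run reports it for the next day
```

In this sketch the cosmetic problem never slows down the developer before the merge, but it still shows up in the morning report.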
Production
When we say production, we mean the final environment where the build is publicly available. As we said, United Cloud has a product that needs to be available all the time. There is therefore no moment when the EON TV platform is not in use and can simply be updated with new features. It is important to find the right balance here.
On the one hand, the new build brings value to users. It will remove some bugs, add some features, improve the user experience. On the other hand, every build is a risk that something might not work properly.
That's why United Cloud applies iterative deployment to production. This means that a part of the users gets the new version of the product, and then we monitor how the platform behaves. If everything works as it should, we move on. We start with 20% of devices, then increase to 50%, and so on until production is updated for 100% of devices.
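One common way to decide which devices belong to the first 20% is to hash a device identifier into a stable bucket. The sketch below assumes that approach; the text does not say how United Cloud actually selects devices, and the function names and the `device-N` identifiers are made up for the example.

```python
import hashlib

# 20 and 50 come from the text above; 100 is the final step.
ROLLOUT_STAGES = [20, 50, 100]


def device_bucket(device_id: str) -> int:
    """Map a device ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(device_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % 100


def gets_new_build(device_id: str, rollout_percent: int) -> bool:
    """A device receives the new build once the rollout percentage passes its bucket."""
    return device_bucket(device_id) < rollout_percent


if __name__ == "__main__":
    devices = [f"device-{i}" for i in range(10_000)]
    for percent in ROLLOUT_STAGES:
        updated = sum(gets_new_build(d, percent) for d in devices)
        print(f"stage {percent}%: {updated} of {len(devices)} devices on the new build")
```

Because the bucket depends only on the device ID, a device that got the new build at 20% keeps it at 50% and 100%, so nobody flips back and forth between versions as the rollout widens.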
Post-production
However, what happens after the release to production is more interesting for developers. That is when we get feedback on what we have coded. The feedback shows us two important things:
1. Cases (use cases) that we did not foresee. It's only when people start using a new feature that we realize we may not have covered them all. Users use the product in a variety of ways, and we can't even guess everything they will do. So new cases turn up here as well. When we find one, we update our test cases.
2. Performance degradation in production, which is even more important information for us. With the new build, did we introduce a performance problem or a memory leak that we could not catch with a small number of test cases?
That is why various techniques are used in post-production. For example, we use monitoring, alerting and testing techniques such as content duplication, traffic duplication, etc. Monitoring matters to us because it gives us feedback from production: we see what is happening and can react preventively if we see that something is wrong. For example, memory graphs often grow linearly. That would not be a problem if memory were unlimited, but since it is limited, the service stops when usage reaches a critical point. That's why we set the alert at 80%, so that it tells us we have a problem somewhere. When that happens, we have to assess the situation. Sometimes the growth is fine: it is expected and will not continue. But sometimes it is not expected. That is the problem we are solving.
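The memory example can be sketched in a few lines of Python. Only the 80% threshold comes from the text; the function names, the sample values and the simple straight-line estimate are assumptions made for illustration, not a description of United Cloud's monitoring stack.

```python
from typing import List, Optional

ALERT_THRESHOLD = 0.80  # the 80% alert level mentioned in the text


def should_alert(used: float, limit: float, threshold: float = ALERT_THRESHOLD) -> bool:
    """Fire the alert once memory usage reaches the threshold share of the limit."""
    return used / limit >= threshold


def intervals_until_limit(samples: List[float], limit: float) -> Optional[float]:
    """Estimate how many more sampling intervals remain before memory hits the limit.

    Assumes roughly linear growth and uses the average slope between the first
    and last sample. Returns None if there is too little data or no growth.
    """
    if len(samples) < 2:
        return None
    slope = (samples[-1] - samples[0]) / (len(samples) - 1)
    if slope <= 0:
        return None
    return (limit - samples[-1]) / slope


if __name__ == "__main__":
    limit_mb = 1000
    usage_mb = [500, 600, 700, 800]  # made-up samples growing by 100 MB per interval
    print(should_alert(usage_mb[-1], limit_mb))
    print(intervals_until_limit(usage_mb, limit_mb))
```

The alert only says that usage crossed the line. The estimate of remaining intervals is one possible input for the assessment that follows: growth that has flattened out gives no estimate at all, while steady growth tells you roughly how much time is left.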
As an additional level of user protection, we have introduced automatic rollback. So if something doesn't work, the app will automatically roll back to the previous stable version.
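As a rough illustration of automatic rollback, the sketch below widens a release step by step and falls back to the previous stable version as soon as a health check fails. The `staged_deploy` function, the `is_healthy` callback and the version strings are hypothetical; how United Cloud actually detects a bad build and triggers the rollback is not described in the text.

```python
from typing import Callable, List, Sequence, Tuple


def staged_deploy(
    stable: str,
    candidate: str,
    is_healthy: Callable[[str, int], bool],
    stages: Sequence[int] = (20, 50, 100),
) -> Tuple[str, List[str]]:
    """Roll out `candidate` stage by stage and roll back on the first failed check.

    Returns the version left running and a log of the steps taken.
    """
    log: List[str] = []
    for percent in stages:
        log.append(f"{candidate} at {percent}%")
        if not is_healthy(candidate, percent):
            log.append(f"rollback to {stable}")
            return stable, log
    return candidate, log


if __name__ == "__main__":
    # Made-up health check: the candidate misbehaves once half the devices have it.
    version, steps = staged_deploy("1.9", "2.0", lambda v, p: p < 50)
    print(version)
    print(steps)
```

In this made-up run the candidate never reaches 100% of devices, and everyone ends up back on the previous stable version without anyone having to intervene by hand.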
What's in it for you?
If all this seems a bit complicated to you - you are right. There are a lot of steps and even more tests. Nevertheless, processes like this are used by many of the largest companies in the world - from the streaming platforms that we all watch, to the largest search engines on the Internet.
These are the necessary steps that United Cloud has decided to take in order to ensure the quality of development. Many outsourcing companies do not have the freedom to design their own process and instead adopt the process of the company they are developing for. Product companies, like United Cloud, can choose what kind of development process they will build. That's exactly why they made the most of that privilege to create a process that works well.
A lot of it is automated, so not everything is done from scratch. Still, the truth is that it takes time to go through and understand the whole process. But once you do - you have comprehensive knowledge. And regardless of whether you write one line of code or a thousand - you will go through the entire system. And the best thing: the knowledge you gain in this process is something that no one can take away from you - and you can apply it everywhere.
And the results?
How well this system works is shown by United Cloud's uptime of 99.9998%. This means that the service works almost non-stop. But more importantly, we know that developers who get used to this system often become evangelists for the process, even when they leave for other companies. That is proof of the success of the system.
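To put the uptime figure in perspective, a quick back-of-the-envelope calculation helps. The snippet below assumes the percentage is measured over a full 365-day year, which the text does not state, and the function name is made up for the example.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a non-leap year


def downtime_seconds_per_year(uptime_percent: float) -> float:
    """Convert an uptime percentage into the downtime it allows over one year."""
    return SECONDS_PER_YEAR * (100.0 - uptime_percent) / 100.0


if __name__ == "__main__":
    seconds = downtime_seconds_per_year(99.9998)
    print(f"{seconds:.1f} seconds of downtime per year")
```

Under that assumption, 99.9998% leaves room for roughly one minute of downtime in an entire year.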
If you found this text interesting, then take a look at our first interview with Igor - Coding in a complex domain.
And if you see yourself in this system, then look at the open positions of United Cloud on Joberty. Maybe you will be the next player in this well-coordinated team.