How did this story start? It seemed fun to take a little trip down memory lane in the most romantic sense possible. To me, it meant remembering all of the programming-related technologies that I have used since I first started programming. That was way back in 2005, I think. And there were a lot of programming languages, frameworks, and libraries that I used since then. It was actually difficult to remember all of them from the last 17 years. Luckily, my external hard drive still has most of my data, starting from when I enrolled in the Faculty of Science (organized by semester). Apart from that, I consulted my LinkedIn profile, Coursera certificates, and some Goodreads data 😀.
Some of the ones that I'll mention in this article include Erlang, Haskell, Scala, Akka, VueJs, and .NET Core, among others. I've deliberately listed the technologies that I do not use in my day-to-day job, but the ones I believe taught me the most.
Why learn different technologies? Well, from most of them, or a combination of them, I learned something new – not only the technology itself, but the concepts and ways of thinking behind it. And that's the main purpose of this article: to discuss how different technologies can help in learning and understanding broader programming concepts.
I learned different technologies from various sources: high school, faculty, Coursera, books, and, of course, work. Some of them I used very briefly, e.g. while reading a book or taking a course. Others I used for years – sometimes for personal or work-related projects, sometimes for teaching, and sometimes I just liked the tool (language, library, etc.).
Well, what did I learn?
Some 10-15 years ago, when parallel and distributed programming started gaining traction, functional programming concepts became mainstream. One of the most prominent of these is idempotency – executing an operation multiple times should produce the same result as executing it once. More on why that matters later.
I learned about it from Scala, SML, Racket, and Haskell. They also taught me how to program with immutable data structures, without mutable variables, and with a focus on writing recursive functions. I took a few Coursera courses (Functional Programming in Scala and Programming Languages) and bought a few books on Scala and Haskell. Afterwards, during my last year at University, I finally started to understand recursion and began to feel comfortable writing recursive functions.
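To show what that style looks like, here's a minimal sketch in Python (not one of the languages above, but the one I use most today): no loops, no mutation, just a base case and a recursive case.

```python
# Sum a sequence recursively, without mutating anything:
# each call works on a smaller immutable slice (a tuple).
def total(xs: tuple) -> int:
    if not xs:                    # base case: empty tuple
        return 0
    head, *tail = xs              # split into first element and the rest
    return head + total(tuple(tail))

print(total((1, 2, 3, 4)))        # 10
```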
Some time around 2012-2013 I started working with JavaScript, AngularJs, BackboneJs, Handlebars, C#, ASP.NET Framework 4 (I think it was), and Entity Framework on some paid projects. That was my first experience working in a team, and I learned a lot from those guys about Git and web development. For the first time, it was important to think about HTTP, with its different methods, and how to transfer content between systems. Perhaps most importantly, I learned about project architectures and how to organize code.
Learning about Erlang and Akka, I also learned about concurrency, entities that are TRULY encapsulated, and messaging. When working with Akka.NET I was actually focusing on messaging, entity organization, and fault recovery. This was also the first time I heard about the let-it-crash approach. Basically, you organize the system so that errors or crashes are isolated: you do not handle every possible error, but the isolated part of the system is allowed to crash because there is a recovery mechanism in place. I also learned how to create a cluster and expose it to the outside world.
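Here's the gist of let-it-crash as a minimal Python sketch (real Erlang/Akka supervisors are far more sophisticated; the flaky worker below is made up for illustration): the worker doesn't defend against every error, a supervisor just restarts it from a known-good state.

```python
import random
import time

def flaky_worker():
    """An isolated unit of work that is allowed to crash."""
    if random.random() < 0.3:
        raise RuntimeError("worker crashed")
    print("work done")

def supervise(task, max_restarts=5):
    """Don't handle every error inside the worker; let it crash
    and restart it, keeping the failure isolated."""
    for attempt in range(1, max_restarts + 1):
        try:
            task()
            return
        except Exception as err:
            print(f"crash ({err}), restarting – attempt {attempt}")
            time.sleep(0.1)  # back off a little before restarting
    raise RuntimeError("too many restarts, escalating")

supervise(flaky_worker)
```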
During my PhD studies, sometime around 2017-2018, I became interested in how programs can be represented as a dataflow graph. It led me to work on a web application (I used GoJS, VueJs, and JavaScript, with .NET Core for the backend) that would allow students to create a program using blocks. Not a terribly new idea. Students could drop blocks onto a canvas and connect certain types of blocks to others. The data – the result of the block – would be viewable on the block itself. Although it failed to help with basic programming concepts, it turned out that it could offer nice visualizations for explaining functional programs operating on collections.
Docker – I had no idea what it was, how it worked, or how it differed from a virtual machine running in VirtualBox, until I got the chance to use it, by recommendation, on an ongoing project as a freelancer. An opportunity I briefly considered turning down.
From Airflow I learned about data pipelines. But more importantly, I started learning about the concepts around extracting, transforming, and loading data – namely ETL, ELT, EL… and you don't need Airflow to implement these concepts.
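To illustrate that last point, here is a toy ETL flow in plain Python – no orchestrator involved. The orders.csv file and its columns are invented for the example.

```python
import csv

def extract(path):
    # Extract: read raw rows from a CSV file.
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def transform(rows):
    # Transform: keep completed orders and normalize the amount.
    return [
        {"id": row["id"], "amount": float(row["amount"])}
        for row in rows
        if row["status"] == "completed"
    ]

def load(rows, path):
    # Load: write the cleaned rows to their destination.
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=["id", "amount"])
        writer.writeheader()
        writer.writerows(rows)

load(transform(extract("orders.csv")), "completed_orders.csv")
```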
How does this help with my day to day job?
Basically, many technologies rely on well-defined principles. The concepts you learn from one technology may be applicable to some others.
From learning functional programming, the thing I found most useful is idempotency. The approach is encouraged by modern technologies like Kubernetes, Airflow, and Celery, all of which we use on a regular basis. When designing pipelines in either Celery or Airflow, the pipeline might fail. When you re-run a failed DAG in Airflow, you want the final outcome to be as if the failure never occurred – in other words, as if the pipeline had run only once. Meaning, you want the pipeline to be idempotent. With Celery, knowing about idempotency led me to design some pipelines the same way. We have pipelines that remove cached data once it is read, but if an error occurs the data is restored to the cache so it can be picked up later. Idempotency is one of the most important concepts in distributed systems, where a node may fail or messages may be delivered multiple times.
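Here's what idempotent means in practice, sketched in Python with a made-up daily-report task: overwriting a deterministic output means a re-run after a failure leaves the system exactly as a single successful run would.

```python
import json

def write_daily_report(date: str, data: dict):
    """Idempotent: overwrites the report for the given date, so
    re-running after a failure yields the same final state as
    one successful run."""
    with open(f"report-{date}.json", "w") as f:
        json.dump(data, f)

# A non-idempotent version would append to a file or increment
# a counter – re-running it after a failure would double-count.
write_daily_report("2022-06-01", {"clicks": 42})
write_daily_report("2022-06-01", {"clicks": 42})  # same outcome as running once
```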
When it comes to Akka.NET, I think that messaging was the most important takeaway. Why? Well, oftentimes a single process will not be able to carry out all of the business requirements, and it needs to coordinate with others. Celery in particular works by sending and reading messages from a queue.
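A minimal Celery example (the broker URL and task are placeholders) makes the messaging explicit: calling delay() doesn't run the function locally, it puts a message on the queue for a worker to pick up.

```python
from celery import Celery

# The broker URL is a placeholder; task calls become messages
# on this queue, and worker processes consume them.
app = Celery("tasks", broker="redis://localhost:6379/0")

@app.task
def send_welcome_email(user_id):
    print(f"sending welcome email to user {user_id}")

# Caller side: this serializes a message and publishes it to the
# queue – the work happens in whichever worker reads it.
send_welcome_email.delay(42)
```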
Allowing things to crash and being aware that processes have their own lifecycle is also relevant to Kubernetes. Kubernetes pods can be terminated and rescheduled at any time as part of their lifecycle, with a grace period before they are killed.
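Concretely, Kubernetes sends your process SIGTERM and only sends SIGKILL after the grace period expires, so the process should use that window to wrap up. A minimal Python sketch (the work loop is a placeholder):

```python
import signal
import sys
import time

def handle_sigterm(signum, frame):
    # Kubernetes sends SIGTERM first and SIGKILL only after the
    # grace period, so use this window to finish or release work.
    print("SIGTERM received, cleaning up...")
    sys.exit(0)

signal.signal(signal.SIGTERM, handle_sigterm)

while True:
    time.sleep(1)  # placeholder for the actual work loop
```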
Akka also taught me about lifecycle hooks and made me think about how to expose an apparently isolated system to the outside world.
When learning about concurrency, you inevitably learn about the bugs that come with it, such as data races and deadlocks. When designing pipelines, which may run in parallel, we think about these problems. We might not catch every edge case – for example, we once deployed a pipeline to production, and after some 10+ months a data race occurred. But we were able to recognize and fix it. Not only that, we were able to reason about the implications of that fix and how to improve.
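For anyone who hasn't been bitten yet, here's the classic data race in a few lines of Python, along with the lock that fixes it (the shared counter is a toy stand-in for any shared state):

```python
import threading

counter = 0
lock = threading.Lock()

def unsafe_increment():
    global counter
    for _ in range(100_000):
        counter += 1          # read-modify-write: threads can interleave here

def safe_increment():
    global counter
    for _ in range(100_000):
        with lock:            # the lock makes the update atomic
            counter += 1

threads = [threading.Thread(target=safe_increment) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 200000 with the lock; often less with unsafe_increment
```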
What about web development? Well, many applications, including web applications, are organized as a set of microservices, and these also need to coordinate. The most basic web application is essentially a client-server system, where you need to consider the communication between a JavaScript client and some server. And I already mentioned that it was while developing web applications that I first started thinking about project architectures and HTTP requests.
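That client-server communication boils down to HTTP requests with different methods. A minimal sketch using Python's standard library (the URL and payload are placeholders):

```python
import json
import urllib.request

# GET: fetch a resource from the server.
with urllib.request.urlopen("https://example.com/api/items") as resp:
    items = json.loads(resp.read())

# POST: transfer content to the server.
payload = json.dumps({"name": "new item"}).encode()
request = urllib.request.Request(
    "https://example.com/api/items",
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)
with urllib.request.urlopen(request) as resp:
    print(resp.status)
```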
Why did I mention dataflow and graphs? Well, it turns out that's what data pipelines are – you know, the things I mentioned multiple times up until now. Additionally, many other technologies focus on this idea. Google's Dataflow is a cloud-managed service that allows building pipelines using Apache Beam and represents them as graphs. Disclaimer: never pass large amounts of data between Airflow operators.
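An Airflow DAG is literally a graph of tasks: you declare the nodes and wire up the edges. A minimal sketch in Airflow 2.x style (the task bodies are placeholders):

```python
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

with DAG(
    dag_id="toy_etl",
    start_date=datetime(2022, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract = PythonOperator(task_id="extract", python_callable=lambda: print("extract"))
    transform = PythonOperator(task_id="transform", python_callable=lambda: print("transform"))
    load = PythonOperator(task_id="load", python_callable=lambda: print("load"))

    # The >> operator defines the edges of the dataflow graph.
    extract >> transform >> load
```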
Many popular programming languages today borrow from both object-oriented and functional concepts to deliver the best of both worlds. For example, Entity Framework, which is used in C# (an object-oriented language) to interact with databases, relies on higher-order functions – a concept from functional programming. Scala takes object-oriented principles and uses them to enable additional abstractions that can help with modularity. Akka implements encapsulation, an idea from object-oriented programming, to the bone: you never interact directly with an entity; rather, you send messages through proxies and hope that the entity that receives the message changes its state.
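A higher-order function is just a function that takes another function as an argument – the same idea behind C#'s Where(x => ...). In Python:

```python
# filter and map are higher-order functions: they take a function
# as an argument, much like LINQ's Where/Select in C#.
orders = [{"id": 1, "total": 30}, {"id": 2, "total": 120}]

big_orders = list(filter(lambda o: o["total"] > 100, orders))
totals = list(map(lambda o: o["total"], orders))

print(big_orders)  # [{'id': 2, 'total': 120}]
print(totals)      # [30, 120]
```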
Looking back at Docker, its real power, in my opinion, is that it allows systems written in very different technologies to communicate with each other. I remember I had been working for about a month or two when we needed to collect some data periodically. The thing was, we had Airflow, whose operators are written in Python, and we had an existing app for making API requests written in JavaScript. As a proof of concept, I spun up Airflow and used one of the operators to start a Docker container running the JavaScript app from Python. (Note that I almost turned down the opportunity that led me to learn Docker.)
But the underlying idea is, again, message passing: you send messages to the system to spin up a Docker container, and then to the application running inside.
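With Docker's Python SDK, that interaction is a couple of calls (the image name and command are placeholders standing in for something like our JavaScript collector; in Airflow, the DockerOperator wraps the same idea):

```python
import docker

# Talk to the local Docker daemon.
client = docker.from_env()

# Run a containerized app and capture its output; the "message"
# to the app is its command line and environment variables.
output = client.containers.run(
    "my-js-collector:latest",          # placeholder image name
    command=["node", "collect.js"],
    environment={"TARGET_URL": "https://example.com"},
    remove=True,                       # clean up the container on exit
)
print(output.decode())
```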
As I probably already mentioned, programming is really based on concepts that are commonly shared across technologies. Therefore, learning any language or other technology might be beneficial if you start thinking about the broader implications.
Let’s say you’re at University and you’re learning about TCP. Why should you be interested in how TCP works? You’re (probably) never going to actually implement it from scratch. Well, it showcases several useful ideas:
You’re actually looking at messaging between two systems;
Messages can be lost over the network – this can happen in your systems too;
There is a mechanism that ensures delivery – a design you may need to implement one day for some other use-case/technology (see the sketch after this list);
There are inevitable tradeoffs in ensuring reliable delivery – and tradeoffs are common in IT.
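Here's a toy version of that delivery-ensuring mechanism: send, wait for an acknowledgment, and retransmit on timeout. The lossy_send function below is a made-up stand-in for a network that sometimes drops packets.

```python
import random

def lossy_send(message):
    """Stand-in for a lossy network: the ack is lost 30% of the time."""
    if random.random() < 0.3:
        return None               # no acknowledgment came back
    return "ack"

def send_reliably(message, max_retries=5):
    """Retransmit until acknowledged – TCP's idea, very roughly."""
    for attempt in range(1, max_retries + 1):
        if lossy_send(message) == "ack":
            print(f"delivered on attempt {attempt}")
            return True
        print(f"timeout on attempt {attempt}, retransmitting...")
    return False

send_reliably("hello")
```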
Also, I once spent 8 hours debugging a program, and the trick turned out to be lowering the packet size 🤷.
What’s the catch?
You need to like what you do: to like learning new things, trying out new design approaches, and improving yourself and your solutions. Don't give up when things don't work – push forward. Don't be satisfied with just doing your job; aspire to improve. Accept that you may spend hours finding a bug whose fix is a single line of code, but know that once something actually works, you'll get a nice sense of achievement. Don't learn just a single technology: backend or frontend, you need to understand both, and you need to know the basics of Docker. You need to know some things that you might not find interesting.
Be prepared to fail – many times. I've been programming almost every day for the last 10 years, and I've failed a million times. In just the last year and a half I've ruined some 6-7 thousand Google ads that we had, broken some data pipelines, and broken the creation of marketing campaigns – and I still don't fully understand every technology that we use. Just a month ago I accidentally wasted the entire budget (< 100 dollars) we had for a third-party API while I was testing. It's not a lot of money, but we had to wait a week for extra credits to be purchased.
Well, fuck-ups happen; the most important thing is to accept responsibility, learn from them, and do what you can to make sure they don't happen to someone else. The results of some of my fuck-ups were code improvements or documents detailing what went wrong so that someone else won't repeat the same mistakes.
And sometimes you don’t know how much you’ve learned until you take a look back.