How we organized our development department.

As of today (October 26, 2022), our beautiful software development department is made up of 32 pretty incredible people:

22 software developers
2 quality assurance specialists
4 product designers
4 product managers

It's starting to get a bit crowded, and we can't really afford to let team structures form organically like we did back in our early days. So, like we’ve had to do more than a few times since oxio started growing, we got out the drawing board and put our brain juices to work on this very human and also organizational challenge: How do we organize a development department? Heads up, a single answer just doesn’t exist. It absolutely depends on what you want to optimize. Let's take a closer look.

TL;DR

We structured our software development department to optimize the productivity, accountability and autonomy of its members.
The teams (squads) are multi-disciplinary, defined around a business domain and have a very weak coupling with the other squads.
All developers are full-stack (frontend + backend + cloud infrastructure), to promote their productivity, empowerment and autonomy.
All our squads are placed into 1 of 4 categories: stream-aligned, enabling, complicated subsystem and platform.
Each squad is independent in its technological choices (to promote autonomy), but code reuse and knowledge sharing are encouraged (to promote productivity).
The structure of the teams is modelled on the desired software architecture, and vice versa, to form synergy.
In the long term, squads with a common mission will be grouped into a tribe and will share common rituals and meetings to stay aligned on the vision.

The resource optimization problem

When you only have a hammer, everything looks an awful lot like a nail.

At oxio, we’re fundamentally obsessed with optimization. So it’s natural that we attacked the organization of our software development department (dev department) as a resource optimization problem. You don't change a winning formula. In all honesty, I don't believe that we can change this aspect of our personality anyway, but that’s a chat for another time.

In short, we went back to basics, aka: what metrics do we want to optimize? I’m really confident about our answer:

Productivity
Accountability
Autonomy

You could take my word for it, but let me explain the rationale behind these choices just in case you don’t trust me just yet.

Productivity

We could argue (and do) that productivity, or more specifically the software release rate, is one of oxio's greatest competitive advantages. Speed as a habit is a recipe that the established big players probably can’t replicate.¹ The ability to quickly move from ideation to production promotes not only sustained innovation, but also rapid validation of business hypotheses. This dynamic is possible because of the inherent nature of a start-up: not many employees, lean organizational structure, high employee autonomy, bare-bones processes, performance culture, etc. We can’t let oxio’s growth hinder this non-stop creation of value for our customers. With that in mind, the objective here is to hold on to the advantages of a start-up, even when we can no longer consider ourselves one.

First assumption: our team structure must promote the productivity of its members and participate in maintaining it over the long term.

Accountability

We take the accountability of team members very seriously at oxio. We constantly hammer our colleagues with the same message: absolutely everything must have a single—aka one—owner! Whether it’s data, a feature, a system, a product or a team, the single owner must always be clearly and easily identifiable. That means accountability isn’t limited to team members in leadership positions; it’s a requirement for everyone at oxio.

The positive consequences of a single owner are tangible and varied. For example, removing ambiguity about who’s responsible greatly simplifies and concentrates communication, both within dev squads and with other departments. Assigning a single owner to each aspect also ensures rigorous monitoring of its development and release. Accountability clarifies who’s responsible for ensuring the quality and maintenance of a feature or system throughout its lifecycle.
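For the developers in the room, the "one owner, always identifiable" rule can be pictured as a tiny registry. This is only a sketch to illustrate the idea, not a tool we actually run: the class, its method names and the asset names are all made up for the example.

```python
class OwnershipError(Exception):
    """Raised when an asset would end up with no owner or with two owners."""


class OwnershipRegistry:
    """Hypothetical registry mapping each asset to exactly one owner."""

    def __init__(self) -> None:
        # asset name (feature, system, dataset, ...) -> single owner
        self._owners: dict[str, str] = {}

    def assign(self, asset: str, owner: str) -> None:
        """Give `asset` its owner; refuse to silently add a second one."""
        current = self._owners.get(asset)
        if current is not None and current != owner:
            raise OwnershipError(
                f"{asset!r} already belongs to {current!r}; transfer it explicitly"
            )
        self._owners[asset] = owner

    def transfer(self, asset: str, new_owner: str) -> None:
        """Hand ownership over in one step, so it is never shared."""
        if asset not in self._owners:
            raise OwnershipError(f"{asset!r} has no owner to transfer from")
        self._owners[asset] = new_owner

    def owner_of(self, asset: str) -> str:
        """Answer "who is responsible?" or fail loudly if nobody is."""
        try:
            return self._owners[asset]
        except KeyError:
            raise OwnershipError(f"{asset!r} has no owner") from None
```

The interesting part is what the registry refuses to do: a second `assign` with a different owner fails, and `owner_of` never returns "nobody" or "a few people".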

Second assumption: we need to encourage a culture where each individual feels responsible and proud to lead part of the company's products. The structure of our teams has to favour this culture.

Autonomy

Autonomy is first and foremost a powerful source of motivation. Regardless of our position, it's super motivating to know that we can do our job with as much freedom and as little friction as possible. In the context of work, this autonomy is often a prerequisite for the curious and daring to take ambitious initiatives. It gives us the room to innovate. (Shout out to No Rules Rules.²) Regardless of role or level of leadership, the person with the most context should be the one who freely and confidently makes the decision. Being autonomous in your work also means not having to ask your boss for permission for each initiative that you deem good for the success of your project.

We understand that offering this high level of autonomy can only work if there are two other underlying elements:

solid confidence in the judgment and execution capacity of team members and
an organization that values taking calculated risks.

There’s also a strong link between the autonomy of a squad and its members and the speed at which they execute their projects. The more autonomy a squad has, the more likely it is that they’ll be able to carry out an end-to-end initiative in a cohesive and efficient way. So, the more dependencies a team has on other squads, departments or processes, the more pitfalls and blockers it will encounter in carrying out its tasks.

Third assumption: our squad structure has to promote the autonomy of each squad and team member.

Squad structure

Once we figured out what we wanted to encourage or optimize within our squads, we needed to find the best way to make it happen. When it comes to building our squads, our philosophy has been influenced by two industry structures—one pretty well-known and one soon-to-be-well-known: Spotify Squads³ (we know, we know, we’ve all seen the video) and Team Topologies⁴ (a well-developed but lesser-known method). We kind of stole what we liked from each structure and landed on a combo adapted and optimized for oxio. And, like everything at oxio, it’ll keep evolving iteratively as we grow and learn.

Team members

Let's say we’ve got a nice little gang of 15-20 developers. We need to separate them into squads in a way that promotes productivity, accountability and autonomy. Ima start with an unpopular opinion: we never want to separate team members into functional silos. For example, if we’re trying to increase productivity, accountability and autonomy, separating our team members into three squads—a squad of frontend developers, another of backend developers and a third focused on cloud infrastructure—would be a pretty horrible idea!

This separation into silos really slows down productivity. Why? First, it involves passing the torch between three different squads before bringing a feature into production. Each squad can be super productive in their silo, but what ultimately matters is the rate at which features are released. Second, these teams probably have very different timelines, priorities, and incentives, so it’s all going to need an extra dose of project management to keep everything in sync and bring each initiative to fruition. Because these three squads are dependent on each other, none of them is truly autonomous in carrying out the initiative. Third, accountability becomes extremely difficult to enforce. Because the development, implementation, maintenance and debugging of features depend on several teams, no one is really responsible for the proper functioning of the entire system.

Some companies try to solve this problem by creating more functional silos (QA, production support, operations, etc.), but this is done at the expense of a squad’s autonomy and productivity.

Okay, so, we don't do that. What we do do is identify the fracture planes that make it possible to segment team members into several squads while ensuring strong cohesion within each squad and weak coupling with other squads. Several kinds of fracture planes can serve this goal. Think business domains, types of end users, compliance requirements, etc., but we really try to favour fragmentation by domain.

For example, our billing squad is responsible for everything related to the billing system and is 100% autonomous to manage its mission. They’re the only ones who can design, develop, test, release, maintain, and support the functionalities of that system and any dependence on external squads is kept to a minimum. When communication with other squads is necessary, everything is ideally done asynchronously to minimize blockers. For example, we favour sending a Slack message rather than organizing a synchronous meeting. These teams also develop cutting-edge expertise in their field and become extremely productive because they swim in the same waters for an extended period.

Another legit advantage of segmentation by business domain is that it ensures a lower mental load than if the segmentation was by technical expertise. For example, a member of the inventory squad doesn't have to worry about how cryptocurrency payments are supported, while a frontend/backend split might require someone to master both areas.
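To make "strong cohesion, weak coupling" a bit more tangible, here is a rough sketch that counts how many dependencies cross a squad boundary under two different splits. The component names, the dependency list and the `cross_squad_ratio` helper are all invented for illustration; a real system has far messier edges.

```python
def cross_squad_ratio(
    assignment: dict[str, str], dependencies: list[tuple[str, str]]
) -> float:
    """Share of dependencies whose two ends belong to different squads."""
    if not dependencies:
        return 0.0
    crossing = sum(1 for a, b in dependencies if assignment[a] != assignment[b])
    return crossing / len(dependencies)


# Hypothetical "X needs Y" edges between components.
dependencies = [
    ("billing-ui", "billing-api"),
    ("billing-api", "billing-infra"),
    ("inventory-ui", "inventory-api"),
    ("inventory-api", "inventory-infra"),
    ("billing-api", "inventory-api"),
]

# Split 1: functional silos (frontend / backend / infrastructure).
by_function = {
    "billing-ui": "frontend", "inventory-ui": "frontend",
    "billing-api": "backend", "inventory-api": "backend",
    "billing-infra": "infrastructure", "inventory-infra": "infrastructure",
}

# Split 2: business domains.
by_domain = {
    "billing-ui": "billing", "billing-api": "billing", "billing-infra": "billing",
    "inventory-ui": "inventory", "inventory-api": "inventory",
    "inventory-infra": "inventory",
}

print(cross_squad_ratio(by_function, dependencies))  # 4 of 5 edges cross squads
print(cross_squad_ratio(by_domain, dependencies))    # 1 of 5 edges crosses squads
```

With this toy data, the functional split forces a hand-off on four of the five dependencies, while the domain split leaves only the billing-to-inventory call crossing a boundary.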

I'll give you a concrete example to illustrate all this. Let’s say we want to build a feature that allows you to diagnose the quality of oxio's internet service for a specific postal code and display the result in a web application intended for technical support agents. Dominic, a kind, talented, renowned developer, initiates the task at the backend level by creating the necessary APIs. Once that’s done, we would need to tackle the frontend and infrastructure. The structure of the squads and their level of autonomy will have a significant impact on what happens next.

Let’s take a look at a few potential scenarios:

1. Dominic is on a backend dev team, so he passes the puck (sorry not sorry for the sports metaphor) to the frontend dev squad so they can get the feature started. He does the same thing with the infrastructure squad. There are three possible outcomes in this scenario:
   - The frontend squad and/or the infrastructure squad are not directly available to initiate their part of the feature, but will prioritize it shortly
     ⇒ 🚫 Life is short and your customers will go elsewhere.
   - The frontend squad and the infrastructure squad can both start working on the task. It’s apparently not a problem since they don’t have much to do anyway
     ⇒ 🚫 Misuse of resources, imminent bankruptcy is likely!
   - The frontend squad and the infrastructure squad can start working on their respective tasks shortly because there has been prior synchronization between the timelines of the three squads and there has been no delay on one side or the other
     ⇒ 😅 So, you got lucky. But, what are the costs and impacts of this inter-team synchronization? I’m guessing a lot.
2. Dominic is in the network diagnostics squad, so he passes the puck to his colleague Samuel who, I’ll admit, is pretty smart when it comes to frontend, and his colleague Simon who is good (okay amazing) when it comes to infrastructure. There are two possible outcomes here:
   - This task was already planned in the diagnostic team sprint, so Samuel and Simon are already qualified to complete their respective part of the task. They are also kept informed each morning of Dominic's progress and can align their flutes (I made up for the sport metaphor with a … music metaphor) accordingly and even get a head start.
     ⇒ ✅ Let's go team, keep it up.
   - One of the three stakeholders has fallen behind and the synchronization of the three parts of the task is not optimal
     ⇒ 😥 It happens. It’s not ideal. At least the task has been prioritized within the team sprint and their communication is smooth, so it should unlock soon(ish).
3. Dominic is a full-stack developer on the Network Diagnostics squad and he knows how to do frontend, backend and infrastructure, and has no pucks to pass on to anyone.
   - This task was planned in the sprint, and Dominic can tackle it from start to finish without having to synchronize with his colleagues, let alone with an external squad.
     ⇒ ✅🧠 Dominic is the hero we need.

So far, I’ve only talked about the influences of autonomy and productivity on squad structure. But it’s a similar exercise when you toss in accountability. If the task is developed by three squads, how is the responsibility shared? If it’s developed by three different people from the same squad, is the chain of responsibility a little clearer?

In the third scenario, Dominic is clearly responsible. There is no ambiguity.

Up next: what exactly are these squads made of? What roles are necessary for their proper functioning? The short answer: all the roles necessary for the team to be autonomous and productive! The longer, better answer:

Squad composition

There are several stakeholders involved in the development of our products and services: software developers, quality assurance specialists, product designers and product managers. All these stakeholders must be part of the same squad in order to share the same mission, deadlines and incentives. It’s the combination of a squad's multi-disciplinary skills that makes it autonomous and effective. Because each feature or system is part of a single squad, that squad can be responsible for the entire life cycle of a project. No, none, zero ambiguity.

Okay, so, I’ve got another unpopular opinion. This one might even be a rule: we only hire full-stack developers. This shouldn’t be all that surprising at this point: it's a way to maximize developer productivity, accountability, and autonomy. A developer who’s efficient in frontend, backend and infrastructure will be able to carry out an end-to-end task independently, without ever having to transfer the responsibility to a fellow developer. Imagine never having to wait for a backend developer to implement a specific API, or for an infra developer to support a new cloud service. You just do it. This level of autonomy plays an enormous role in productivity. Oh, and because of the realities of oxio’s remote work culture, our developers often find themselves in different time zones. So, being full-stack isn’t just practical, it’s almost a necessity.

We also follow the same full-stack philosophy for our other roles within the department. Product managers act as product owners, our quality assurance specialists carry out manual and automated tests, and our product designers work as much on the interface as on the user experience. We’re aware that it’s going to get harder and harder to maintain this unspoken rule as the size of the dev department increases, but we’ll keep actively working to maintain this dynamic for as long as we can.

Let’s chat about something a little more concrete for a minute. What’s the optimal squad size? Well, that depends on the complexity of the squad’s mission and the mental load necessary to encompass it. On our side, we aim for smaller teams of 6-7 people that include:

3 or 4 full-stack developers (including a team lead)
1 quality assurance specialist
1 product designer
1 product manager

One small, teeny-tiny detail: team members with roles that have lighter mental loads can be members of multiple squads—if their workload allows it and the squads share a similar mission. In a perfect world, they should be assigned to squads with a strong cohesion so as to minimize their mental load and maximize their productivity. For example, it makes sense for one of our product designers to be part of both the Inventory and Product Catalog squads, because these two systems are essentially configuration modules dedicated to administrators and have more than a few similarities.

We keep our squads deliberately small. This forces us to choose initiatives that keep us focused on the priorities that contribute to our business objectives. Life is just too short to be spending time and energy on features that do not directly support the success of our products. But but but, the small size of our squads can be demanding. To minimize the mental load for each squad, we put a lot of effort into thinking about our business domain segmentations. Otherwise, a squad could find itself collecting several different business domains over time. And this would require more and more specialized knowledge and changes of focus, and would ultimately increase the mental load of its team members. To wrap it all up, after testing a few different configurations and sizes, we found that limiting the number of developers led to increased productivity, mainly because team members develop a single expertise. The cozy size of our squads also forces us to optimize our work processes, our communication and our technological choices. Turns out necessity really is the mother of invention after all.

Horizontals

Because quality assurance specialists, designers and product managers are spread across squads to keep each squad autonomous, they often spend a lot of time alone with their expertise. Being alone makes it hard for them to share their learnings and pool best practices.

We’ve solved this by encouraging specialists in the same role to schedule recurring horizontal meets. These meets give them a time and place to chat about their issues and produce documentation and tools centered around their expertise. Knowledge sharing can also boost the productivity of specialists who, as you might know, are constantly updating their toolbox.

The Team (Squad) Lead

If you’ve made it this far, reading this shouldn’t be a surprise: the responsibility of managing a squad will end up in the hands of one person. Even if each specialist (dev, product, design, QA) manages the evolution of their expertise within the team, it’s ultimately the role of the lead to ensure that harmony reigns between the members and that the squad is kicking ass when it comes to performance. Okay, this part might be new: the squad lead is also a senior developer. And that means they also act as a technical reference (tech lead) for their squad. And and and, you guessed it: having the same person assigned to the personnel management and technical direction of the squad is really only sustainable if the dev squad remains small.

The (pretty intense) responsibilities of the squad lead are:

Do full-stack software development for at least 50% of their time. (What? They’re still a senior developer, and we need all the help we can get.)
Ensure everyone gets along.
Make tough technical decisions that no one else can or wants to make.
Coach junior devs.
Do one-on-ones with every team member, every week.
Assess the performance of team members.
Track and present the metrics associated with the squad’s mission, on a recurring basis.
Participate in the hiring process.
Choose the rituals and processes adapted to the reality of their squad.
Support product managers in defining and improving the product.

So being a squad lead takes someone with very, very good human and personnel management skills who is also an experienced and super talented technical leader (aka: not easy to find).

Team topologies

Based on the excellent book Team Topologies (seriously, it’s worth a read), there are four different, you guessed it, team topologies.⁴ This philosophy of team organization is mainly oriented towards optimizing the productivity of dev teams, but also touches on autonomy, accountability and personal development. And at the center of it all are the focus teams; the three other topologies are there to support them.

Team Topologies offers an interesting exercise to help assess the mental load of the teams: assign a complexity rating to each project between 1 and 3 and, taking into account all their projects, aim for each team to accumulate no more than 3 complexity points. It’s that simple.
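The exercise is simple enough to fit in a few lines of code. Here is a minimal sketch of it, assuming the budget of 3 points described above; the squad names, project names and ratings are invented for illustration and the function names are hypothetical.

```python
MAX_POINTS_PER_SQUAD = 3  # the budget quoted in the paragraph above


def squad_load(projects: dict[str, int]) -> int:
    """Sum the complexity ratings (1 to 3) of everything a squad owns."""
    for name, rating in projects.items():
        if rating not in (1, 2, 3):
            raise ValueError(f"{name!r}: rating must be 1, 2 or 3, got {rating}")
    return sum(projects.values())


def overloaded_squads(squads: dict[str, dict[str, int]]) -> list[str]:
    """Names of squads whose total exceeds the budget, sorted alphabetically."""
    return sorted(
        name
        for name, projects in squads.items()
        if squad_load(projects) > MAX_POINTS_PER_SQUAD
    )


# Invented squads and ratings, for illustration only.
squads = {
    "billing": {"billing system": 3},
    "catalog": {"product catalog": 1, "import tool": 1},
    "diagnostics": {"diagnostics API": 2, "support web app": 2},
}

print(overloaded_squads(squads))  # only "diagnostics" totals more than 3
```

In this made-up example, the diagnostics squad carries 4 points, which would be the signal to split its scope or move one project elsewhere.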

All that to say that, at oxio, we’ve drawn a lot of inspiration from these topologies to build our own squads. It is an interesting (and proven) foundation on which we can iterate and calmly adapt to our culture. Alright, here are the four main topologies.

Stream-aligned (focus)

Focus squads should, by far, be the most numerous squads in a modern software department. They work on a single workflow (product, system, etc.) and are responsible for quickly and independently producing a ton of value for customers. These squads must be autonomous on every aspect of the product for its entire life cycle, such as: security, infrastructure, user experience, marketing, etc. All other team topologies are used to help focus squads be as productive as possible.

For example, our squad in charge of developing the customer management system is a focus squad because its mission is to develop a product that directly generates value for our consumers. This team is supported by other squads, in particular by the squad that manages the Notifications service which makes tools to facilitate the work the customer management system squad is doing.

Enabling

Enabling squads are essentially used to coach focus squads on certain aspects and relieve them of certain responsibilities. Again, the goal here is to make the focus squad as productive as possible. An enabling squad is made up of very senior members who are excellent communicators and is expected to be more homogeneous than multi-disciplinary. These squads exist temporarily: they stick around just long enough to fulfill their specific coaching mission and are then disbanded.

For example, an enabling squad can relieve a focus squad by doing proofs of concept for them, helping them with their technological choices or producing documentation. They can also help improve knowledge with coaching in more specialized areas (machine learning, security, blockchain, etc.), to prevent the focus squad from changing its focus.

Complicated subsystem

A complicated subsystem squad is made up of specialists responsible for building and maintaining a part of a system that’s been deemed pretty friggin’ complex, aka complex enough that the other squads would have to stop what they’re doing and learn how it works. We don’t want that, so the complicated subsystem squad takes over. The goal here is to limit the mental load of the focus squad by allowing them to take advantage of this complicated subsystem squad while remaining focused on their own product.

For example, our internet squad (we know, we’re super imaginative when it comes to squad naming) is a small team of specialists and has all the expertise they need on activating and maintaining an internet connection across various Canadian networks. The internet squad breaks down all the complexity of sending requests and communicating with these networks and translates them into a simple API that the focus squads can use without asking too many questions. This team is made up of only three devs and a quality assurance specialist.
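In code terms, this is the classic facade idea: one small surface in front of several messy integrations. The sketch below is purely illustrative and is not our actual API. The two network classes, their return formats and the routing by postal-code prefix are all assumptions made up for the example.

```python
from typing import Protocol


class NetworkAdapter(Protocol):
    """What every underlying network integration must offer (hypothetical)."""

    def activate(self, postal_code: str) -> str: ...


class NetworkA:
    """Made-up network that wants postal codes uppercased, without spaces."""

    def activate(self, postal_code: str) -> str:
        return "network-a:" + postal_code.replace(" ", "").upper()


class NetworkB:
    """Made-up network that wants postal codes lowercased."""

    def activate(self, postal_code: str) -> str:
        return "network-b:" + postal_code.lower()


class InternetService:
    """Facade: the only surface a focus squad needs to know about."""

    def __init__(self, coverage: dict[str, NetworkAdapter]) -> None:
        # Keys are three-character postal-code prefixes (an assumption
        # of this sketch, not a statement about how routing really works).
        self._coverage = coverage

    def activate_connection(self, postal_code: str) -> str:
        """Pick the right network and hide its quirks from the caller."""
        adapter = self._coverage.get(postal_code[:3].upper())
        if adapter is None:
            raise LookupError(f"no network covers {postal_code!r}")
        return adapter.activate(postal_code)
```

A focus squad would only ever call `activate_connection`, without knowing (or caring) which network ends up doing the work.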

Platform

The platform squads help the focus squads by offering tools with a low level of abstraction that they can consume within their product. The objective is, once again, to allow focus squads to have as many tools as possible at their disposal to be as productive as possible in their development. Typically, a platform squad can develop tools that help other teams with cloud infrastructure maintenance, application monitoring, auditing, payment processing, sending communications, etc. In short, unlike the focus squads, the “customers” of the platform squads are fellow developers.

In oxio’s case, we don't have a dedicated platform squad at the moment. This is mostly because each focus squad has in-house developers with infrastructure expertise. However, we have a few microservices that were built with the intent of being used as a platform by other squads. This is the case with our Notifications service, which allows other squads to send emails and SMS in a standardized way without having to worry about internal operations.

Technological autonomy

Simultaneously optimizing for autonomy and productivity can sometimes (okay a lot of the time) be tricky to manage. For example, several teams concerned about maintaining their autonomy could develop similar tools in parallel without knowing, negatively impacting their potential. But, forcing squads to constantly synchronize and share as much code as possible can also harm their autonomy (shout out to Working Backwards.⁵) So, how the heck do we find that magical balance between reuse and freedom of technological choice?

In practice, we hold technical demos every two weeks (recorded so they can be watched asynchronously) where developers can present something that may be useful to other squads. When there’s enough excitement for a certain tool or feature, it gets extracted into a common reusable library to avoid other squads having to reinvent the wheel. Squads are then free to use it or not, depending on their own use cases.

In the end, each squad is free to make its technological choices. If a squad wants to deviate from the technologies and common practices proposed by the department, it must, however, clearly justify its choice, preferably based on our three god-like metrics above. Okay, here they are again: productivity, accountability and autonomy.

Software architecture

If there was one thing I would like to share with my colleagues who want to structure a software development department, it’d be Conway's Law.⁶ So, Ima share it 🤓 :

The structure of the development teams greatly influences the structure of the software produced by these teams.

If that’s true (I think it is), then it’s super important to propose a structure that works in synergy with the desired software architecture. Otherwise, the structure of the teams can straight-up hinder the emergence of an architecture.

Remember that we have small squads all across Canada with strong cohesion, organized by business domain, that have a weak coupling with other teams and opt for asynchronous communication when they have to share information outside of their team.

If you’re in the know, you’ll have no problem spotting the underlying architecture: domain-driven, distributed microservices, based on an event-driven asynchronous communication architecture. Coincidence? I don’t think so.
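For anyone who isn't in the know, here is roughly what "event-driven asynchronous communication" means, shrunk down to a toy that runs in a single process. It is a sketch under heavy assumptions: a real setup would use a message broker between services, and the `EventBus` class, the `invoice.paid` topic and the payload shape are invented for the example.

```python
from collections import defaultdict, deque
from typing import Callable

Handler = Callable[[dict], None]


class EventBus:
    """Toy in-memory stand-in for a message broker (illustration only)."""

    def __init__(self) -> None:
        self._handlers: dict[str, list[Handler]] = defaultdict(list)
        self._queue: deque[tuple[str, dict]] = deque()

    def subscribe(self, topic: str, handler: Handler) -> None:
        """A squad's service registers interest in a topic."""
        self._handlers[topic].append(handler)

    def publish(self, topic: str, payload: dict) -> None:
        """Record the event and return at once; the publisher never waits."""
        self._queue.append((topic, payload))

    def drain(self) -> int:
        """Deliver queued events in order; returns the number of deliveries."""
        delivered = 0
        while self._queue:
            topic, payload = self._queue.popleft()
            for handler in self._handlers[topic]:
                handler(payload)
                delivered += 1
        return delivered


# Hypothetical usage: billing announces a fact, notifications reacts later.
bus = EventBus()
outbox: list[str] = []
bus.subscribe(
    "invoice.paid",
    lambda event: outbox.append(f"receipt for invoice {event['invoice_id']}"),
)
bus.publish("invoice.paid", {"invoice_id": 42})  # billing moves on immediately
bus.drain()                                      # delivery happens separately
print(outbox)
```

The publisher knows nothing about who listens, which is the same weak coupling we ask of the squads themselves.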

This model has its challenges.

Other than Denis Villeneuve's Dune movie, there isn't much that's perfect in this world. Through trial and error and observing other companies develop a similar model, we’ve discovered a few challenges while developing our philosophy for departmental structure.

First, finding the right segmentations by business domain is really not as easy as it sounds. Okay it doesn’t even sound that easy. But still, it can be complicated. You need an excellent knowledge of the business domain and a good dose of trial and error to really grasp the interconnections between the different areas and understand their limits. And to top it all off, this is a pretty important step. Poor segmentation and the team may end up with too many dependencies on other teams (aka lose autonomy) or have too large or ambiguous a scope that will be impossible to manage.

Next challenge: to get the most out of our model, we need to hire full-stack developers and, ideally, 50% of them need to be seniors. Hiring developers is already a ridiculous challenge for the majority of companies, so adding the qualifiers “full-stack” and “senior” can reduce the pool of candidates to, well, not very many. That said, we’re not ready to compromise on this just yet. (The outsized advantages in terms of accountability and our solid hiring average over the past couple of years make it too hard a criterion to let go of.)

There’s also the concern of selecting programming languages, tools and architectures that reinforce the desired squad structure. For example, a microservices architecture promotes the cohesion and decoupling of squads; a technology like TypeScript allows the use of a single language for the frontend, the backend and the infrastructure-as-code, promoting autonomy and the hiring of full-stack developers; and a tool like GraphQL Federation centralizes the APIs of each team without them needing to coordinate.

Finally, communication at the departmental level is definitely DEFINITELY a challenge. With a structure that favours a smaller number of big squads, it would be easier to align everyone in terms of vision and strategy. In our case, with a lot of small, self-contained squads working on their respective business domains, there’s an increased need to present and synchronize everyone on the big picture which may be completely foreign to them.

The evolution of this structure

If the mixture of luck and good execution continues to necessitate growth in our department, we’ll need to make some changes. And those changes are:

First, squads with a strong cohesion and a similar mission will have to regroup within a tribe—thanks again Spotify Squads. Members of the tribe will share common rituals, such as synchronization meetings and technical demonstrations. This will save us all from those painful 40-person meetings. For example, future Billing + Debt Collection + Payment squads could be part of a Revenue tribe that would encompass everything related to the revenue part of the software. Before we go any further, let's take another second to appreciate the creativity behind the naming of our tribes and squads.

We’ll also have to create more horizontal meetings to effectively share best practices related to expertise (see “Chapter” on the image below). Currently, we only have horizontals for designers and product managers, but the day will come when it’ll be beneficial to do the same with the different areas of development (frontend, backend, infrastructure, data analysis, blockchain, ...). Doing this will ensure a certain cohesion and a wider sharing of knowledge.

Conclusion

There are a lot of ways to organize a software development department, its many squads and the different members within those squads. And that means that there are obviously also a lot of ways to get it all wrong, especially if the structure you want to apply isn’t clearly thought out to promote the desired behaviours.

Even if you do everything right, the behaviours will change depending on the stage of growth the company is in. And that’ll require a change in the structure of the department. In the case of oxio, our structure primarily favours execution speed, and it’s a conscious choice that gives us a clear competitive advantage in the industry. That being said, the saying “move fast, break things” which serves us very well for now could, in the long term, just stop working for us. The solution: you need to constantly adapt to the new realities of your company, and market, to remain competitive.

Okay phew. That was a lot. If you’ve got an opinion, an experience or a comment to share on this subject, write to me at olivier@oxio.ca. I live for feedback and would love to chat!

Thanks to the pretty incredible Simon Frenette for his dangerous knowledge of Conway's Law and many, many hours of brainstorming we did in California (between two games of spikeball) to shape oxio’s long-term vision. And a BIG thank you to everyone in the department for their seemingly unending willingness to test (and sometimes suffer) the many iterations and evolutions of this structure.

Written by Olivier Falardeau, Head of Engineering, while drinking a Peach Bubly and listening to Blackwater Park by Opeth.
