
7 Key Characteristics of a Good Staff Engineer

If you are wondering what a staff engineer does, what areas of work staff engineers can contribute to, or what the top characteristics of a good staff engineer are, then you are in the right place.

In this post, I cover the 7 key characteristics of a good staff engineer. If you are a senior engineer and wondering how you can get promoted to a staff engineer, I share some details that can help you in your next promotion plan.

What is a Staff Engineer?

Different companies have different ladders for software engineers. In my experience, I have seen

  • software engineer => senior software engineer => principal software engineer => architect
  • software engineer => senior software engineer => staff software engineer => principal software engineer

Staff Engineer is a role on the individual contributor ladder. On the management path, its closest counterpart is the Engineering Manager. However, the duties of a staff engineer and those of an engineering manager are different.

Irrespective of what the ladder looks like, staff engineer is a senior role in which an engineer helps the team and other engineers lead various projects.

The difference between a senior engineer and a staff engineer lies in the impact they create on the team and projects. A senior engineer can lead a single project while a staff engineer can be part of multiple projects.

Let’s dive into the 7 key characteristics of a good staff engineer.

7 Characteristics of a Good Staff Engineer

1. Stay Curious

Irrespective of your impact or role, you should always stay curious. With evolving technologies and stacks, curiosity helps you find different solutions, both to existing problems and to new ones.

2. Always Be Learning

Stay humble. Irrespective of how much experience you have, you should always be learning. Learn from your seniors, from your juniors, from peers, and other engineers. Learn from every source of information you read.

The quickest way to grow is to keep learning. And the quickest way to stop learning is to think that you have all the answers.

Not knowing something is a strength and that keeps the door open to learn that thing. The smartest staff engineers are always learning. They are aware that technology will keep changing and they have to keep learning.

3. Company First, Team First

Staff Engineers can have a huge impact on a company’s growth. In early-stage startups, staff engineers form the engineering culture, create and build processes, and build the knowledge base.

It would be selfish of engineers to think only of their own careers. Avoiding that trap is especially important for staff engineers: they have to set up the team for success, and they have to set up the company for success.

4. Know the system

A good staff engineer knows the big picture of the system they operate in and has end-to-end knowledge of the services they work on or manage. System design should be one of a staff engineer’s strengths. You could argue that staff engineers need to know the entire architecture of the platform the company runs, but that is not necessary. Having a big picture of the platform is good enough to figure out risks.

A good staff engineer can also foresee the possible risks as the system goes through various stages of scale and growth.

5. Provide honest feedback

Working with engineers of different skill sets helps staff engineers figure out those engineers’ strengths. In return, staff engineers have to provide honest feedback to help those engineers accelerate their careers.

The staff engineer is an ally for the manager and the rest of the team, fostering a bond between them and setting both parties up for success.

6. Make your team 10x

Delegation is one of the hardest skills for an engineer to learn, especially when they know they could build something themselves. But it is equally important to help other engineers grow. A good staff engineer not only helps in various project activities but also helps the rest of the team to 10x their output.

By providing knowledge, training, skills, and feedback, a good staff engineer helps the team do better and better.

7. Action First

In every project, there will be a time when there is not enough clarity about requirements. And even if there is, the engineers coordinating on the project may be stuck making a decision while everybody is still figuring out the trade-offs. In such cases, a good staff engineer leads the team to action.

A good staff engineer prioritizes action so the team can keep moving and not get stuck on the various decisions to be made. A good staff engineer makes some bold calls to keep moving the needle and, in the process, leads the team to further progress.

Conclusion

In this post, I covered the 7 key characteristics a staff engineer can have. Not every staff engineer will have all of these characteristics.

Most staff engineers I know have a combination of these characteristics and skills. Figuring out your own strengths is part of the job.

Three Dimensions of Product Building

In this post, I want to talk about 3 dimensions every engineer can consider while building a good product.

If a product exists, it must solve a pain or a problem, or at least elevate the experience of the user even if there was never a problem before. A good example of this is the iPod.

In short, a product must do at least one of the following:

  • save time for a user
  • save money for a user
  • make money for a user

Let’s look at these dimensions in detail now.

1. Save time

Most users will pay for your product if it saves them time. Time is money, and that plays a big role when it comes to software products. When users need a solution to a problem, they look at various aspects, including whether the solution will save them time. That is one reason mainframe usage has declined around the world: with advanced hardware, building performant software has become easier.

Depending on the problem a consumer is facing, the product can also improve the user experience. From productivity apps to convenience apps, they all help us save time. Uber gets us a taxi quickly without leaving the house; Doordash delivers food without wasting our time on a commute.

Time is money. Save time, save money.

2. Save Money

If you exclude luxury items, most people love to save money on the things they buy. A great example is Amazon shopping: you can buy almost anything on Amazon, and with multiple options available, users can choose whichever one saves them money.

Cheap flights, cheap show tickets, cheap houses. You name it, people love to save money. In the current inflationary market, everyone would love to save some.

Black Friday is a shopping holiday where people spend thousands of dollars to save some money on things they want.

If your product can save money for your users, and for your company, then it is worth building that product.

You can use this dimension to decide what features to build, and what bugs to fix.

3. Make Money

I intentionally kept this dimension of product building for last. After all, who does not like to make more money? If your product can make money for others, you will be at the top of the pyramid of wanted products.

This dimension has two aspects: making money for your users, and making money for yourself as the product builder.

If you build a product that solves a pain for users, they will pay you to use that product. Can your product also make money for your users in the process? Can you leverage your product in a way that can open another stream of income for yourself other than subscription users?

A lot of fintech products become custodians of their customers’ money and solve various problems for those customers, from payment processing and invoicing to taxes and purchase parity. But they also make money by holding that money for their customers. Stripe is a good example of this.

Other Aspects of Product Building

What about user experience? What about user ease of product usage? And what about the security of the product?

Yes, yes, and yes. They are all important and necessary to build over time. If you ask why you are building or fixing something, you will find an answer that aligns with these 3 dimensions of product building.

User experience will enhance product usage for users and in turn, users can pay money to use the product.

Security will help to avoid future risks from hackers and save you time and money.

Next time you build a feature, a fix, or an entire product, look at every task with these 3 dimensions in mind. If you can’t find the right reason immediately, you can put that feature or fix on the back burner.

Conclusion

In this post, I shared the three dimensions of product building. These 3 dimensions are the fundamental building blocks of products. As a product engineer, if you look at these fundamental blocks while solving your next problem, you will be able to build a product that people want.

How To Use Event Emitter Technique in NestJS App

In this post, I demonstrate how to use the event-emitter technique in a NestJS application. This particular technique is part of the NestJS framework, but along the way you can also learn about the fundamentals of event-driven architecture and event sourcing in microservices.

Introduction

Event-based architecture helps in building scalable applications. The major advantage of this architecture pattern is that every event has a source and a destination, aka a publisher and a subscriber, and the subscriber processes events asynchronously. Frameworks like NestJS provide techniques for emitting events without much additional overhead. Let’s dive into this event-emitter technique.

Event Emitter Technique

The NestJS documentation says: “Event Emitter package (@nestjs/event-emitter) provides a simple observer implementation, allowing you to subscribe and listen for various events that occur in your application”.

Using this technique allows you to publish an event within your service/application and let another part of the service use that event for further processing. In this post, I will demonstrate the following flow:

  • client uploads a file
  • an event is emitted for a file uploaded
  • another service processes that file based on that event.

Now, two questions arise: when should you use this technique, and are there any drawbacks to it? Let’s look into that.

When To Use

In an application, if you want to do some CPU-heavy or data-intensive work, you can consider doing that work asynchronously. But how do we start this asynchronous work? That’s where the event emitter comes into the picture.

One part of your application will emit an event and another part will listen to that event. The listener will process that event and perform the downstream work.

To understand this better, let’s look at an example.

Demo

Controller –

We have a simple REST controller to upload a file.

@Controller('/v1/api/fileUpload')
export class FileController {
    constructor(private fileUploadService: FileUploadService) {}

    @Post()
    @UseInterceptors(FileInterceptor('file'))
    async uploadFile(@UploadedFile() file: Express.Multer.File): Promise<void> {
        const uploadedFile = await this.fileUploadService.uploadFile(file.buffer, file.originalname);
        console.log('File has been uploaded,', uploadedFile.fileName);        
    }

}

This controller uses another NestJS technique, the Interceptor, which we have seen in a previous post.

Register the Event Emitter Module –

To use an event emitter, we first register the EventEmitterModule in our application module. It will look like below:

imports: [EventEmitterModule.forRoot()]

Service –

Our controller uses a fileUploadService to upload the file to an AWS S3 bucket. This service will also emit an event after the file has been uploaded.


@Injectable()
export class FileUploadService {
    constructor(private prismaService: PrismaService,
        private readonly configService: ConfigService,
        private eventEmitter: EventEmitter2){}
    
    async uploadFile(dataBuffer: Buffer, fileName: string): Promise<FileEntity> {
        const s3 = new S3();
        const uploadResult = await s3.upload({
            Bucket: this.configService.get('AWS_BUCKET_NAME'),
            Body: dataBuffer,
            Key: `${uuid()}-${fileName}`,
        }).promise();

        const fileStorageInDB = ({
            fileName: fileName,
            fileUrl: uploadResult.Location,
            key: uploadResult.Key,
        });

        const filestored = await this.prismaService.fileEntity.create({
            data: fileStorageInDB
        });

        const fileUploadedEvent = new FileUploadedEvent();
        fileUploadedEvent.fileEntityId = filestored.id;
        fileUploadedEvent.fileName = filestored.fileName;
        fileUploadedEvent.fileUrl = filestored.fileUrl;

        const internalEventData = ({
            eventName: 'user.fileUploaded',
            eventStatus: EventStatus.PENDING,
            eventPayload: JSON.stringify(fileUploadedEvent),
        });

        const internalEventCreated = await this.prismaService.internalEvent.create({
            data: internalEventData
        });
        fileUploadedEvent.id = internalEventCreated.id;

        if (internalEventCreated) {
            console.log('Publishing an internal event');
            const emitted = this.eventEmitter.emit(
                'user.fileUploaded',
                fileUploadedEvent
            );
            if (emitted) {
                console.log('Event emitted');
            }
        }

        return filestored;
    }
}

This service code is doing a few things.

  • Upload a file to AWS S3
  • Store the internal event user.fileUploaded
  • Emit that event

One reason we store the event in the database is to know exactly when the event was emitted, and to be able to re-emit it if we ever want to reprocess the same event.

Note that we inject the EventEmitter2 class in our constructor. To be able to use this class, we must install the @nestjs/event-emitter dependency in our Nest project.

Event Listener –

We have a service that is emitting the event, but we will also need a listener service that will process the emitted event. So let’s write our listener service.

Our event class looks like this:

export class FileUploadedEvent {
    id: string;
    fileName: string;
    fileUrl: string;
    fileEntityId: number;
}

And our listener class will look like below:

@Injectable()
export class FileUploadedListener {

  constructor(private prismaService: PrismaService){}

  @OnEvent('user.fileUploaded')
  async handleFileUploadedEvent(event: FileUploadedEvent) {
    
    console.log('File has been uploaded');
    console.log(event);
    await this.prismaService.internalEvent.update(
        {
            where: {
                id: event.id,
            },
            data: {
                eventStatus: EventStatus.PROCESSED,
            }
        }
    );
    console.log('File will get processed');

  }
}

To listen to emitted events, we use the @OnEvent decorator. Just like this listener, we can add more listeners for the same event, and each listener can do its own set of work.

Now, if we run our application and upload a file, an event will be emitted and handled, and we can see the database entries for the event.

Conclusion

In this post, I showed how to use the Event Emitter technique in a NestJS Application.

Why Observability Matters in System Design

1. Introduction

In this post, I want to talk about observability and how to design a system with observability in mind. Observability means understanding the internal state of a system through its external outputs. But what exactly are we watching when it comes to systems or microservices? We’ll dig into that. In one of my previous posts, I talked about observability through collecting metrics.

Often, we build a system backward. We start with an idea and make a product for people to use. We hope everything works smoothly, but it rarely does. There are always problems, challenges, and the product might not work as users expect. Users might say your product isn’t working, get frustrated, and leave. In the end, it just becomes another not-so-great product.

When users report problems, an engineer has to figure out why it happened. Having logs that show what went wrong can be really helpful. But in these situations, we’re reacting—the system is already doing something, and we’re waiting for it to mess up. If someone else has to understand our system without knowing all the technical stuff, it’s tough for them to figure out what’s happening. They might have to use the product to really get what’s going on. Is there a better way? Can we design our system so anyone can understand how it’s doing without diving into all the complicated details?

How do we design a system that allows us to observe it? Can we have a mechanism that alerts us to an issue before a customer comes to us? How can observability help us here?

2. Components of Observability

There are 4 components to Observability:

  • Log Aggregation
  • Metrics Collection
  • Distributed Tracing
  • Alerting

Each of these 4 components of observability is a building block for reliable system design.

Let’s look at each of them.

2.1 Log Aggregation

Whether your system uses microservices or a monolith, there is communication between frontend and backend, or backend to backend. The data flows from one system to another following some business logic. Logs are the key to understanding how the production system is working.

While designing any system, we should form a convention for how we want to log information. Logging the right data should be a prerequisite to building any application. Every system design comes with its own set of challenges, and log aggregation is the least of them, but having it will help in the long term. How you implement log aggregation can depend on a variety of factors, such as the tech stack and cloud provider. With modern tools, it has become easy to query aggregated logs.

When you start building your system, adopt a convention for how and what to log from your application. One caution: do not log any personally identifying information.
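To make such a convention concrete, here is a minimal sketch of what a structured JSON logging helper could look like. The field names and the PII deny-list below are illustrative assumptions for this sketch, not a prescribed standard.

```typescript
// Illustrative logging convention: structured JSON with a fixed set of
// fields and a deny-list of personally identifying keys.
type LogLevel = 'info' | 'warn' | 'error';

const PII_FIELDS = ['email', 'phone', 'ssn']; // assumed deny-list

function formatLog(
  level: LogLevel,
  service: string,
  message: string,
  context: Record<string, unknown> = {},
): string {
  // Drop any context keys that look like personal data before logging.
  const scrubbed = Object.fromEntries(
    Object.entries(context).filter(([key]) => !PII_FIELDS.includes(key)),
  );
  return JSON.stringify({
    timestamp: new Date().toISOString(),
    level,
    service,
    message,
    context: scrubbed,
  });
}
```

Because every service emits the same shape, an aggregator can index and query the logs uniformly.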

2.2 Metrics Collection

As your system matures, you will notice trends in the resources it uses. You might face issues where you allocated too little disk space or memory; then, on an emergency basis, you will need to increase them, and even then you might have to make a judgement call about how much.

This is where metrics collection comes into the picture. If we build a system that provides us metrics like latency, memory/CPU usage, disk usage, and read/write ops, it helps us with capacity planning. It also provides more accurate information on how to scale when we start to see heavy load.

Metrics also allow us to detect anomalies in our system. With the monitoring tools available, collecting metrics has become easier. Datadog, Splunk, and New Relic are a few of the observability tools.

Other than system metrics, one can also build product metrics using such tools. Product metrics let you see how your users have been using (or not using) your application and what your product lacks. It is one of the ways to gather feedback on your own product and improve it.
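As a rough illustration of the idea (a real system would rely on a tool like the ones above or a metrics client library rather than hand-rolling this), a latency metric boils down to recording samples and reading percentiles:

```typescript
// Minimal in-memory latency histogram, for illustration only.
// Percentiles use the nearest-rank method on the sorted samples.
class LatencyHistogram {
  private samples: number[] = [];

  record(ms: number): void {
    this.samples.push(ms);
  }

  percentile(p: number): number {
    if (this.samples.length === 0) return 0;
    const sorted = [...this.samples].sort((a, b) => a - b);
    const rank = Math.ceil((p / 100) * sorted.length);
    return sorted[Math.max(0, rank - 1)];
  }
}
```

A dashboard reading p50 and p99 from such a histogram is often the first signal used for capacity planning.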

2.3 Distributed Tracing

Overall, microservice-based systems are getting more complex. All the microservices within a single system talk to each other and process data. It would be nice if there were a way to trace each call that comes from the client through these services. That’s where distributed tracing comes into the picture.

Distributed tracing captures the activity within a local thread as a span, and eventually collects all these spans in one central place. All the spans are linked to each other through a common correlation id or trace id.

The advantage of distributed tracing is being able to trace a call end to end and see exactly where in the chain of microservices an issue or anomaly occurred.
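The linking of spans can be sketched in a few lines. In practice you would use a framework such as OpenTelemetry rather than hand-rolling this, but the underlying data model is roughly the following (the field names here are illustrative):

```typescript
import { randomUUID } from 'crypto';

// Each span carries the trace id of the whole request; a child span
// also records its parent's span id, which is what lets a tracing
// backend reassemble the end-to-end call.
interface Span {
  traceId: string;   // shared across the whole request
  spanId: string;    // unique per unit of work
  parentId?: string; // links this span to its caller
  name: string;
}

function startTrace(name: string): Span {
  return { traceId: randomUUID(), spanId: randomUUID(), name };
}

function childSpan(parent: Span, name: string): Span {
  return {
    traceId: parent.traceId,
    spanId: randomUUID(),
    parentId: parent.spanId,
    name,
  };
}
```

Every service in the call path creates child spans from the span it received, so the collector can stitch the whole request back together by trace id.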

2.4 Alerting

At the beginning of this post, I mentioned how users of our product can get frustrated and bring various issues to our attention. In many such cases, we are reactive: once the user reports an issue, we investigate why it happened and how to prevent it from happening again.

With distributed tracing and metrics in place, can we create alerts to inform us about issues in our application? Yes, that’s where alerting comes into the picture. We can collect metrics related to errors, or gather error data through distributed tracing and logs.

We also want to be careful not to alert so much that alerts become noise. Having thresholds around errors helps, and anomaly alerts can be handy. In the end, it all depends on your application and the observability data you gather.
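As a sketch of such thresholding, an error-rate alert can be as simple as the check below. The 5% default is an arbitrary assumption for illustration; the right value depends entirely on your application.

```typescript
// Alert only when the error rate over a window crosses a threshold,
// so a single failure does not page anyone. Threshold is illustrative.
function shouldAlert(errors: number, total: number, threshold = 0.05): boolean {
  if (total === 0) return false; // no traffic in the window, nothing to alert on
  return errors / total > threshold;
}
```

Evaluating this over a sliding window of requests, rather than per request, is what keeps the signal-to-noise ratio reasonable.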

3. System Design with Observability

We talked about the key components of observability, and we know how to design a system against certain requirements and industry practices. How do we build this system with observability in mind? Setting service level agreements (SLAs) and service level objectives (SLOs) before building your application can direct engineers: they can build observability in as they build the application instead of bolting it on later. This is a more proactive approach. Many microservices can also integrate with OpenTelemetry, a vendor-neutral observability framework.

4. Conclusion

In this post, I discussed monitoring, observability and how an engineer can think about these while building a system. To be more proactive with these practices, system design should take observability into account.

Worker Pattern in Microservices

In this post, we will discuss a new design pattern – the worker pattern and how to use this pattern in microservices. Previously, I have covered communication and event-driven patterns in microservices.

What is a Worker Pattern?

Let’s start with a scenario. You have a microservice that processes business logic. For a new set of business requirements, you need to build a new feature in the microservice, and this feature will be resource-intensive, processing a large set of data. One way to handle any resource-intensive data processing is to do it asynchronously. There are different ways to implement this: you can create a job and queue it in a queue system, or you can publish an event for another service to process the data. Irrespective of your approach, you will need a way to control the flow of this data.

With either approach, you want to avoid blocking your main service while processing these asynchronous events. Let’s assume you went with the job approach; you can use Bull Queue (a Redis-based queuing mechanism).

Bull Queue offers a processor for each job that your service adds to the queue. This processor runs in its own process and performs the job. For resource-intensive operations, however, this can still become a bottleneck and stop performing the way you want.

In such cases, you can create a standalone worker. This standalone worker will run on its own set of resources (like Kubernetes pod). This worker will process the same job that your service added to the queue.

When to use Worker Pattern?

Every use case is different, but a simple heuristic is to check how much CPU-heavy work the processor plans to do. If the job from the queue is CPU-heavy, create a separate processor or worker. Within the same service, you can write a separate processor that executes the job in a separate process; this is also known as a sandboxed processor.
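The heuristic above could be sketched as a placement decision. The thresholds below are made-up numbers purely for illustration, not guidance from Bull or NestJS:

```typescript
// Hypothetical placement heuristic: the cheaper the job, the closer it
// runs to the main service. The CPU-time thresholds are assumptions.
type JobPlacement = 'inline' | 'sandboxed-processor' | 'standalone-worker';

function placeJob(estimatedCpuMs: number): JobPlacement {
  if (estimatedCpuMs < 50) return 'inline'; // trivial work: same process
  if (estimatedCpuMs < 5000) return 'sandboxed-processor'; // separate process
  return 'standalone-worker'; // separate deployment, scaled independently
}
```

The actual cutoffs should come from profiling your own jobs, not from this sketch.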

A worker runs as a standalone service on its own resources and executes the processor. Since it executes a single job, it does not interfere with the other processes of the service. Another advantage of the worker pattern is the ability to scale resources horizontally. With a sandboxed processor, you might have to scale the whole service, and all scaled-up resources are divided equally among its processes.

Example of Worker Pattern

Let’s look at a quick example of a worker pattern.

We have a simple endpoint that adds a job to a queue. This endpoint is part of a controller in a microservice. It will look something like below:

@Controller()
export class JobController {
  constructor(@InjectQueue('worker-pattern-queue') private queue: Queue) {}

  @Post()
  async createFileProcessingJob(@Body() data: Record<string, unknown>): Promise<void> {
    await this.queue.add('process-file', data);
  }
}

Now, this controller is part of app.module.ts, where we also have to register Redis for the Bull queue.


@Module({
  imports: [
    BullModule.registerQueue({
      name: 'worker-pattern-queue',
      redis: {
        host: 'localhost',
        port: 6379,
      },
    }),
  ],
})
export class AppModule {}

To create a standalone worker, we can create a separate module and use that module to create a new application context. For example, we create a new module subscriber.module.ts and it has our consumer (worker/processor) as a provider.

 
@Processor('worker-pattern-queue')
export class FileQueueProcessor {
  @Process('process-file')
  async handleFileProcessing(job: Job) {
    // resource-intensive file processing goes here
  }
}

The module will look like

@Module({
  imports: [
    BullModule.registerQueue({
      name: 'worker-pattern-queue',
      redis: {
        host: 'localhost',
        port: 6379,
      },
    }),
  ],
  providers: [FileQueueProcessor],
})
export class SubscriberModule {}

Then create a separate folder with its own main.ts for the worker. It should include:

const app = await NestFactory.createApplicationContext(SubscriberModule);
await app.init();

Now you can run this main.ts as a separate worker in a Docker container or a Kubernetes pod, and you can scale it horizontally as well.

Conclusion

In this post, I showed how to use the worker pattern in microservices to handle resource-intensive requirements.

Not related to this post, but my previous post was about handling flakiness.