Command vs Event: It’s About Where the Logic Lives

In most cases, the decision between sync and async is straightforward.

If you must know the result immediately (e.g. payment authorization, OTP), you go synchronous.

Otherwise, you go async.

The real question is:

When using async, should we send a Command or publish an Event?

1. Async Decision: Command vs. Event

Once we choose async, we usually end up with two options:

  • Send a Command

  • Publish an Event

At a glance, they look very similar — both go through queues.

But in practice, they lead to very different systems.
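To make the difference concrete, here is a minimal in-memory sketch. The `Bus` class and all queue/topic names are illustrative stand-ins, not a real broker: a command goes point-to-point to one known consumer, while an event fans out to whoever subscribed.

```typescript
type Handler = (payload: unknown) => void;

// Toy message bus: queues have a single consumer, topics have many subscribers.
class Bus {
  private queues = new Map<string, Handler>();
  private topics = new Map<string, Handler[]>();

  consume(queue: string, h: Handler): void { this.queues.set(queue, h); }
  subscribe(topic: string, h: Handler): void {
    this.topics.set(topic, [...(this.topics.get(topic) ?? []), h]);
  }
  // Command: the sender knows exactly who must act and what they must do.
  send(queue: string, payload: unknown): void { this.queues.get(queue)?.(payload); }
  // Event: the sender states a fact; subscribers decide how to react.
  publish(topic: string, payload: unknown): void {
    (this.topics.get(topic) ?? []).forEach((h) => h(payload));
  }
}

const bus = new Bus();
const invoiced: unknown[] = [];
const reactions: string[] = [];

bus.consume('invoice.commands', (p) => invoiced.push(p));
bus.subscribe('orders.events', () => reactions.push('send-email'));
bus.subscribe('orders.events', () => reactions.push('update-analytics'));

// Command: the orchestrator owns the flow ("paying triggers invoicing").
bus.send('invoice.commands', { type: 'GenerateInvoice', orderId: '42' });
// Event: the logic lives in each subscriber; the publisher knows none of them.
bus.publish('orders.events', { type: 'OrderPaid', orderId: '42' });
```

Notice where the knowledge sits: with the command, the sender owns the flow; with the event, each subscriber owns its own reaction. That is the visibility-versus-decoupling trade-off this post is about.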


2. The Real Problem in Practice

In real systems, this decision is rarely clean.

You often see a lot of async commands already in place, and later a push to “move to events” for better decoupling. But after the change, things don’t necessarily get simpler — just different.

Logic doesn’t disappear — it moves

What each option feels like in practice:

Command:

  • Easy to follow — everything is in one place

  • Low coordination cost

  • But tends to leak logic and create coupling

Event:

  • Decoupled and extensible

  • But spreads logic across services

  • And increases coordination and debugging cost

In short:

  • Command keeps things visible but coupled

  • Event decouples but spreads the logic

3. What should drive the decision?

The real decision is not “Command or Event” in isolation.

It is a trade-off between:

  • keeping the logic centralized, at the cost of tighter coupling

  • decoupling services, at the cost of distributing the logic

So the useful question is:

What matters more in this case: visibility or decoupling?

A few factors usually drive that decision:

  • Flow visibility — Do we need one place to understand and debug the business flow?

  • Expected growth — Is this likely to attract more downstream reactions over time?

  • Coordination cost — Can the teams afford the extra schema, tracing, and ownership overhead?

  • Cost of coupling — Is the downstream a simple utility, or a separate domain we don’t want to bind to?

In practice:

  • Choose Command when keeping the flow visible and local matters more

  • Choose Event when reducing coupling and enabling future extensibility matters more

Error Handling & Logging Best Practice

Some context: while reviewing our code, I noticed that some catch-and-rethrow blocks add nothing, and some logging is just as useless. So here I'm trying to figure out best practice.

Catch-and-Rethrow Rules

Only catch and rethrow (here I'm only talking about rethrowing) when you:

  1. Change the business semantics, e.g. a DB unique violation → ConflictError with 409

  2. Change the HTTP status, e.g. turning a generic error into a specific status

  3. Add additional info to the error log

Some anti-patterns:

```typescript
// Adds no value, just noise: correlationId can be found in the logger context already.
catch (error) {
  this.logger.error('Something failed', { correlationId, error });
  throw error;
}

// Throwing a new error type that no one actually handles.
// This also loses error.cause.
catch (error) {
  throw new XXXInternalError('UNIQUE_CODE');
}
// Question: is a unique, greppable string worth the cost of a custom error
// class, a catch block, and a lost error chain?
```
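By contrast, here is a sketch of a rethrow that earns its keep under rules 1 and 2. `ConflictError` and `isUniqueViolation` are illustrative names, and the `'23505'` Postgres error code is an assumption about the underlying driver:

```typescript
class ConflictError extends Error {
  readonly status = 409;
  cause?: unknown;
  constructor(message: string, options?: { cause?: unknown }) {
    super(message);
    this.name = 'ConflictError';
    this.cause = options?.cause; // keep the original error chained
  }
}

// Assumption: Postgres-style unique_violation code on the driver error.
function isUniqueViolation(e: unknown): boolean {
  return (e as { code?: string } | null)?.code === '23505';
}

function saveUser(insert: () => void): void {
  try {
    insert();
  } catch (error) {
    if (isUniqueViolation(error)) {
      // Changes business semantics AND the eventual HTTP status (rules 1 and 2),
      // without losing the original error.
      throw new ConflictError('User already exists', { cause: error });
    }
    throw error; // everything else passes through untouched
  }
}
```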

Logging Best Practice

1. Log at the boundary, not at every layer.

ClientService and ThirdParty calls should already be logged (via interceptors).
DB request and response logging sits at debug level, which is off in prod.

2. Don’t log what’s already in the trace context

🙋‍♂️ But we can’t see it in Grafana logs.
Actually, some of it is logged automatically: the logger is rebound per request, so traceId and correlationId are already in the logger context.

3. Use the right log level

Seems very basic, but I just realized I was using the wrong log level: I was abusing log.info.

info should be used only at the boundary, or for business outcomes like a card or account being created.

4. Log what the error doesn’t already tell you

N/A

5. Don’t log sensitive data

We’re using LOGGING_MASKED_SENSITIVE_FIELDS to mask PII

6. Prefer structured metadata over string interpolation

Bad:

```typescript
this.logger.info(`Publish data for page: ${page}`);
```

Good:

```typescript
this.logger.info('Publish data', { page });
```

7. Log the outcome, not the attempt

Verbose:

```typescript
this.logger.info('Creating account in DA');
this.logger.info('Creating account in DB');
// do work
this.logger.info('Account created', { accountNumber });
```

Better:

```typescript
// Optional: this.logger.debug('Creating account in DA');
// Optional: this.logger.debug('Creating account in DB');
this.logger.info('Account created', { accountNumber });
```

8. Catch-and-swallow must always log

N/A


I very often feel I don’t have enough information for debugging, so I tend to add noise to the logger at info level.
Actually, if we log at the boundary and on business outcomes, that should be enough. Everything else should be debug.

pino supports runtime level changes. We need to add an endpoint, listen to config, or accept a request header to switch on debug.
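The switching itself is cheap, since pino exposes a mutable `logger.level`. The sketch below mimics that shape with a tiny stand-in logger so the pattern is visible without the dependency; the `Logger` class here is illustrative, not pino's API:

```typescript
const LEVELS = ['debug', 'info', 'warn', 'error'] as const;
type Level = (typeof LEVELS)[number];

// Stand-in logger with a mutable level, mirroring pino's `logger.level`.
class Logger {
  level: Level = 'info';
  lines: string[] = [];
  private log(level: Level, msg: string): void {
    // Only emit if the message's level is at or above the current threshold.
    if (LEVELS.indexOf(level) >= LEVELS.indexOf(this.level)) {
      this.lines.push(`${level}: ${msg}`);
    }
  }
  debug(msg: string): void { this.log('debug', msg); }
  info(msg: string): void { this.log('info', msg); }
}

const logger = new Logger();
logger.debug('hidden in prod'); // dropped while level is info

// e.g. an admin endpoint or a request header flips the level on demand:
logger.level = 'debug';
logger.debug('now visible for debugging');
```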

Two TypeScript Config Issues I Recently Hit

Recently I ran into two TypeScript configuration issues, so I’m sharing them here in case they save someone else a few hours of head-scratching.

1. Types: Vitest vs Jasmine

We’re in the process of migrating our tests from Karma (Jasmine) to Vitest.

At some point, I removed this line from one of my Vitest test files:

import '@testing-library/jasmine-dom';

Immediately, tons of lint and type errors popped up, all complaining that describe (and friends) could not be found.

This felt wrong — Vitest tests don’t use Jasmine at all, so why would removing a Jasmine import break things?

Our TS setup

We have four TypeScript config files:

  • tsconfig.json
    → base config, also used by the IDE (Cursor / VS Code)

  • tsconfig.app.json
    → referenced by tsconfig.json, used by the app

  • tsconfig.eslint.json
    → used by ESLint

  • tsconfig.spec.json
    → used by Karma tests

This distinction is important:

❌ Type errors during yarn lint
→ problem is in tsconfig.eslint.json

❌ Type errors shown in the IDE
→ problem is in tsconfig.json (this is what the IDE uses by default)

Fixing the lint issue

For linting, the fix was straightforward. Since we currently have both Jasmine and Vitest in the codebase, ESLint still needs to know about Jasmine globals:

```js
env: {
  jasmine: true,
}
```

That alone fixed the lint errors.

Fixing the type issue

The type errors were caused by this setting:

"types": []

When types is explicitly set, TypeScript stops auto-including global types. So we must add Vitest globals back manually:

"types": ["vitest/globals", "node"]

Additionally, we had to update typeRoots:

"typeRoots": ["./node_modules", "./node_modules/@types"]

Why?
Because the default is only node_modules/@types, but not all type definitions actually live there. Vitest’s types are under node_modules, so without this change, TypeScript simply couldn’t see them.
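Putting both fixes together, the relevant part ended up looking roughly like this (a fragment to merge into the affected tsconfig, not a complete file):

```json
{
  "compilerOptions": {
    "types": ["vitest/globals", "node"],
    "typeRoots": ["./node_modules", "./node_modules/@types"]
  }
}
```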

2. Paths: tsconfig.paths vs Yarn Workspaces

The second issue was related to path resolution, and it showed up only in Karma tests.

We have a library called ui-lib.

In TS files, this works fine:

import { something } from 'ui-lib/directive';

In global.scss, this also works:

@import 'ui-lib/directive/directive.scss';

But when running Karma tests, it failed with an error saying the SCSS path couldn’t be resolved.

The real cause

We had paths configured in tsconfig for ui-lib.

The problem is:
👉 tsconfig.paths is only understood by the TypeScript compiler, not by Karma, Sass, or other tooling unless explicitly wired up.

So Karma had no idea how to resolve that SCSS path.

Do we even need tsconfig.paths?

We’re using Nx + Yarn workspaces, which already handle package resolution very well.

So the question became: do we actually need tsconfig.paths here?

Answer: no.

If Yarn workspaces are working properly, you usually don’t need tsconfig.paths for internal packages.

There are valid use cases for paths, for example:

  • Creating an alias like @utils that points to a folder inside src

  • Aliases that are not real packages

But for workspace libraries like ui-lib, paths just add unnecessary complexity.

The fix

  • Stick with Yarn workspace resolution

  • Remove the unnecessary tsconfig.paths entries

  • Update ui-lib’s exports field to correctly expose the paths we need

Once that was done, the SCSS resolution issue disappeared — in Karma and everywhere else.
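For illustration, the exports change had roughly this shape (the entry names and file paths here are made up; the real ones depend on the library's layout):

```json
{
  "name": "ui-lib",
  "exports": {
    "./directive": "./src/directive/index.ts",
    "./directive/*.scss": "./src/directive/*.scss"
  }
}
```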

Final takeaway

  • Be very clear about which TS config is used by which tool

  • If Yarn workspaces already solve the problem, don’t fight them

Hope this helps someone else avoid the same traps 🙂

Page Object Model Pattern

Every team seems to have its own pattern for writing E2E tests.
Joining a new team made me realise that Page Object Models (POM) are still surprisingly debated.
Why do people argue about it so much? Why is there no standard way?

The truth is: there is no universally agreed best practice for Page Objects in modern E2E testing.

Here’s my perspective after working across Selenium, Cypress, and now Playwright.

What Page Objects try to solve

At its core, POM tries to address two concerns:

  1. Encapsulation – keep UI knowledge and selectors inside page objects
  2. Separation of Responsibilities – tests describe behaviour, page objects describe UI interactions

Selenium: Strict Separation

In Selenium, I used to follow a very rigid pattern: page objects handled only actions and navigation, while assertions lived exclusively in the test files.

```typescript
SendMoneyPage
  .selectBSB()
  .fillBSBAccount(bsb, accountName, accountNumber)
  .tapCheck() // returns MatchResultPage
  .run((page) => {
    // Assertions ONLY in tests
    expect(page.getContact()).toEqual({});
  })
  .tapContinue(); // returns next page
```

Rules:

  1. Page Objects contain no assertions
  2. Page Objects describe behaviour and navigation only
  3. Verification logic stays in tests

This may look strange if you’ve never used classic POM, but it follows the principles from Martin Fowler’s original article on Page Objects.

Cypress: Function-Based helpers

Cypress takes the opposite stance. It discourages Page Objects completely and encourages simple reusable functions:

```typescript
function fillBSBAccount(bsb: string, accountName: string, accountNumber: string) {
  cy.getByLabel('bsb').type(bsb);
  cy.getByLabel('accountName').type(accountName);
  cy.getByLabel('accountNumber').type(accountNumber);
}
```

Cypress philosophy:
“Just write the story.”
Its fluent command chain makes POM feel unnecessary.

Playwright: Pragmatic Page Objects

Playwright reintroduces Page Objects, but with a different philosophy.

The key difference:
Playwright has built-in auto-waiting, auto-retry, and strong assertions.

These features fundamentally change how POM should be structured.

The key realisation

The classic “Separation of Responsibilities” becomes less practical with Playwright.

Example: checking if a button is visible.

A strict POM approach:

```typescript
expect(page.isButtonVisible()).toBe(true);
```

This is actually worse:

  • isButtonVisible() doesn’t retry

  • Assertions on booleans don’t retry

  • You bypass Playwright’s reliability system

The recommended approach:

```typescript
expect(page.buttonLocator).toBeVisible(); // Built-in auto-retry
```

Playwright’s documentation emphasises:

Page objects simplify authoring by creating a higher-level API which suits your application and simplify maintenance by capturing element selectors in one place and create reusable code to avoid repetition.
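A minimal page object in that spirit keeps selectors in one place but exposes raw locators, so tests can still use Playwright's retrying `expect(locator)` matchers. The sketch below uses local stand-in types for `Page` and `Locator` so it is self-contained; in a real suite they come from `@playwright/test`, and the page and field names are illustrative:

```typescript
// Stand-ins for the Playwright types this sketch needs.
interface Locator { selector: string }
interface Field extends Locator { fill(value: string): Promise<void> }
interface Page {
  getByRole(role: string, opts: { name: string }): Locator;
  getByLabel(label: string): Field;
}

class SendMoneyPage {
  // Selectors are captured once and exposed as locators rather than booleans,
  // so assertions like `await expect(sendMoney.continueButton).toBeVisible()` retry.
  readonly continueButton: Locator;

  constructor(private readonly page: Page) {
    this.continueButton = page.getByRole('button', { name: 'Continue' });
  }

  // Higher-level action: one call instead of three raw fills in every test.
  async fillBSBAccount(bsb: string, accountName: string, accountNumber: string): Promise<void> {
    await this.page.getByLabel('bsb').fill(bsb);
    await this.page.getByLabel('accountName').fill(accountName);
    await this.page.getByLabel('accountNumber').fill(accountNumber);
  }
}
```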

So… is “Separation of Responsibilities” still relevant?

Not really.

E2E tests are naturally narrative-driven. They read like stories:

“Log in, click this, expect that.”

Cypress embraces this.
Playwright mostly embraces this too, with optional structure via Page Objects.

Encapsulation still matters — grouping selectors and common actions improves readability and maintenance.
But strict, academic POM rules? They matter far less today.

Final thought

There is no universal “correct” Page Object pattern anymore.
Modern frameworks — especially Playwright — optimise for reliability, not structure.

So instead of asking:

“What is the right Page Object pattern?”

A better question is:

“Why am I using this pattern, and is it helping my tests stay readable, maintainable, and reliable?”

That’s the actual purpose of Page Objects — everything else is just preference.

DynamoDB the Advantages and Considerations of Single Table Design

Developers familiar with Relational Database design often find themselves initially drawn to the familiar territory of normalization when designing data models for DynamoDB. This instinct typically leads them towards what’s known as multi-table design, where each entity or relationship resides in a separate table.

On the other hand, DynamoDB’s schemaless nature encourages a different approach: single-table design, where all entities and relationships coexist within a single table. However, it’s worth noting that these two designs represent extremes on a spectrum, rather than strict boundaries.

According to the official documentation, single-table design is often recommended. This article explores the advantages of single-table design based on practical experience.

Our project predominantly uses single-table design, largely influenced by The DynamoDB Book, written by DynamoDB advocate Alex DeBrie. Despite our commitment to single-table design, we still manage more than a dozen tables, albeit with a focus on storing related entities together.

Read more

AWS connect last agent call routing

Challenge

In a recent project, we were assigned the task of routing calls to the last agent who interacted with the caller.
This requirement may seem basic, but it proved to be more complex than anticipated,
especially considering its integration into Salesforce via Salesforce Service Cloud Voice.

Read more

AWS client getaddrinfo EMFILE issue

Recently, we introduced AWS Cloud Map for service discovery, primarily to retrieve queue URLs. Weeks after deployment, however, we encountered intermittent errors logged as getaddrinfo EMFILE events.ap-southeast-2.amazonaws.com. Not all requests triggered the error, which pointed to a selective issue.

Upon inspection, it became apparent that we were facing a socket timeout problem, a known issue in our setup. The remedy was simple: reusing our existing agent.
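The fix amounts to sharing one keep-alive agent instead of letting each request open fresh sockets. A sketch of that idea (the maxSockets value is arbitrary, and the commented-out SDK wiring assumes AWS SDK v3 with @smithy/node-http-handler):

```typescript
import { Agent } from 'https';

// One shared keep-alive agent: sockets get reused instead of piling up
// until the process hits the file-descriptor limit (EMFILE).
export const sharedAgent = new Agent({ keepAlive: true, maxSockets: 50 });

// Wiring it into an AWS SDK v3 client would look roughly like:
// const client = new SQSClient({
//   requestHandler: new NodeHttpHandler({ httpsAgent: sharedAgent }),
// });
```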

Read more

How to prevent duplicate SQS messages

Problem

In our system, queue processors must implement idempotency to prevent the double-processing of messages. Duplicate messages may arise in the following scenarios:

  1. Scheduler and Message Producer: The scheduler or message producer may be triggered multiple times, occasionally rerunning due to timeouts.

  2. Queue Management: If a lambda instance times out while processing a message, another instance may retrieve the same message if the visibility timeout is not properly set.

This can have terrible consequences. We aim to avoid sending duplicate emails or messages to our customers, not to mention inadvertently delivering duplicate gift cards.

So a generic idempotency mechanism is required.
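The usual shape for such a mechanism is claim-before-process: record the message id before doing the work, and skip if it was already claimed. The sketch below keeps claims in memory for clarity; a real implementation would use an atomic conditional write (e.g. a DynamoDB PutItem with attribute_not_exists) so the claim holds across lambda instances. All names here are illustrative:

```typescript
// In-memory stand-in for a durable idempotency store.
class IdempotencyStore {
  private claimed = new Set<string>();
  /** Returns true only for the first claim of a given message id. */
  claim(messageId: string): boolean {
    if (this.claimed.has(messageId)) return false;
    this.claimed.add(messageId);
    return true;
  }
}

function handleMessage(store: IdempotencyStore, id: string, work: () => void): string {
  if (!store.claim(id)) return 'skipped-duplicate'; // already processed (or in flight)
  work();
  return 'processed';
}
```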

Read more

A bug which should have been solved a week ago

Recently we had a DS ticket saying that a user periodically saw a banner error on the home page, which means some API requests were hitting BE errors. I checked New Relic and found no exceptions. I checked the error logs on our server and found only a few 500 errors, with no further information about them. So I assumed it was caused by an unstable network. I was not working on that task at the time.

Then, while working on a cache memory issue yesterday, I still hadn’t realized that the banner errors were caused by it. After submitting the PR that fixed the cache memory issue, I decided to look at the DS ticket again and noticed a clue: I could see the banner errors in FullStory, which confirmed we really did get request errors. I then checked the request logs on Cloudflare, and there were the unsuccessful requests.

Read more

Auth0 lock issue

This Friday, while demoing a case, we noticed that it took a long time to display content on the Home page. It happened occasionally, and we soon found it happened only when we had logged in to the ULP but not the HOME page. In particular, if you remove the cookie flag auth0.is.authenticated and refresh the page, you can reproduce it. You might have to wait for more than 10s.

Read more