Implement Notification Service¶
Overview¶
The Notification Service is a message delivery service that routes events from subscribed message brokers or event streams to an organization's communication channels.
By encapsulating channel-specific integration logic—such as transport protocols, authentication mechanisms, and payload schemas—the service allows client systems to emit notification events without direct dependency on external provider APIs or protocols.
Objectives:

- Provide a uniform, HTTP-based interface for notification submission
- Decouple internal systems from external communication channel protocols
- Support multiple communication channels through modular integration adapters
- Ensure consistent validation, delivery, and reliability behavior across channels
In Scope:

The Notification Service is responsible for:

- Accepting notification requests from client systems over HTTPS
- Validating and normalizing incoming notification payloads
- Routing notifications to one or more configured communication channels
- Transforming normalized notifications into channel-specific message formats
- Delivering messages using channel-appropriate protocols (e.g., HTTP-based APIs, webhooks)
- Applying basic reliability mechanisms, including error handling and retry policies
Out of Scope:

The following capabilities are explicitly excluded from the scope of this service:

- Complex workflow orchestration or multi-step notification logic
- End-user identity management or user-level authorization
- Lifecycle management of external communication platforms
- Business-level alerting rules, escalation policies, or suppression logic
Integration Model:

Each communication channel shall be integrated through a dedicated adapter that encapsulates:

- Channel-specific transport protocols
- Authentication and credential handling mechanisms
- Message payload transformation logic
- Error mapping and retry semantics
The architecture should allow additional channels or protocols to be introduced without requiring changes to existing client integrations.
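This adapter model can be sketched as follows; the `Notification`, `ChannelAdapter`, and `SlackAdapter` names and the field layout are illustrative assumptions, not part of the service contract:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class Notification:
    """Normalized notification handed to every adapter (illustrative schema)."""
    title: str
    body: str
    target: str  # channel-specific destination, e.g. a Slack channel ID


class ChannelAdapter(ABC):
    """Encapsulates transport, authentication, payload transformation, and error mapping."""

    @abstractmethod
    def transform(self, notification: Notification) -> dict:
        """Convert the normalized notification into a channel-specific payload."""

    @abstractmethod
    def deliver(self, payload: dict) -> None:
        """Send the payload using the channel's transport and credentials."""


class SlackAdapter(ChannelAdapter):
    """Example adapter: a new channel is added by writing a new subclass,
    so existing client integrations never change."""

    def transform(self, notification: Notification) -> dict:
        return {
            "channel": notification.target,
            "text": f"*{notification.title}*\n{notification.body}",
        }

    def deliver(self, payload: dict) -> None:
        # A real implementation would call the Slack Web API here,
        # applying the channel's retry and error-mapping policy.
        raise NotImplementedError
```

Because clients depend only on the `ChannelAdapter` interface, introducing a new channel or protocol is a matter of registering another subclass.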
SAD - System Architecture Design¶
Logical View¶
The logical architecture separates core notification processing from channel delivery concerns.
The following diagram shows the high-level architecture of the system.
```mermaid
flowchart LR
    %% Component
    client[Client]
    event_sources[Event Sources]
    queue[Queue<br>Rate Limit Service]
    notification_service[Notification Service]
    communication_channel[Communication Channel]
    %% Flow
    client -- dispatch --> event_sources
    event_sources -- publish event --> queue -- submit --> notification_service
    notification_service -- route --> communication_channel
```

In this flow:
- Client dispatches notification events
- Event Sources collect and publish notification events
- Queue / Rate Limit Service buffers, throttles, and submits events
- Notification Service processes events and routes notifications
- Communication Channels deliver messages via channel-specific integrations
The communication channels are:
| Channel | Description | Typical Use Cases | Delivery Pattern |
|---|---|---|---|
| Slack | Sends messages to Slack workspaces via bot or webhook integrations | Ops alerts, CI/CD notifications, team updates | Webhook / Bot API |
| Email | Delivers notifications through SMTP or email service providers | User notifications, reports, confirmations | SMTP / ESP API |
| HTTP | Pushes events to external systems via HTTP callbacks (webhooks) | System-to-system integration, event propagation | REST / Webhook |
| SMS | Sends text messages to mobile numbers via telecom providers | OTP, critical alerts, transactional messaging | SMS Gateway API |
| GitHub | Posts notifications to GitHub (issues, comments, checks, statuses) | CI results, automation, workflow feedback | GitHub REST / GraphQL |
| Discord Channel | Sends messages to Discord servers or channels using bots or webhooks | Community alerts, bot notifications, monitoring | Webhook / Bot API |
As shown above, the notification system is intentionally modular, allowing different combinations of clients, ingestion methods, queues, and delivery channels depending on workload characteristics. In practice, workloads vary significantly by event volume, latency sensitivity, fan-out, and target audience. Rather than prescribing a single deployment pattern, the architecture supports multiple compositions optimized for specific use cases. In this assignment, we narrow the scope to one or two representative workloads that illustrate how these components are combined in real scenarios, starting with a notification workload driven by Cloud Build changes.
Workload¶
Notification: Cloud Build Change Notifications¶
This workload processes build and deployment events emitted by Google Cloud Build via configured notifiers. Build lifecycle events (such as build started, succeeded, or failed) are published to the notification pipeline, where they are evaluated against notifier rules and delivery conditions. The Notification Service formats and routes these events to designated Slack channels, ensuring that the appropriate engineering groups are notified in near real time about build status changes, failures, and deployment outcomes. This workload is optimized for CI/CD visibility, fast feedback loops, and team-level operational awareness.
```mermaid
flowchart LR
    %% Component
    subgraph Client
        build[Build]
    end
    subgraph queue[Queue]
        pub
        sub
    end
    subgraph service[Service]
        notification[Notification Service]
    end
    subgraph channel[Communication Channel]
        slack[Slack]
    end
    %% Flow
    build -- change status<br>emit event --> pub[PubSub Topic] -- sink --> sub[Subscription] -- http push --> notification -- push --> channel
```

Within this flow, the following workflow is implemented:
```mermaid
sequenceDiagram
    autonumber
    participant build as Cloud Build<br>Build
    participant pubsub_topic as Pub/Sub<br>Topic
    participant pubsub_subscription as Pub/Sub<br>Subscription
    participant service as Notification<br>Service
    participant channel_slack as Slack<br>Channel
    build->>pubsub_topic: Emit build status event
    pubsub_topic-->>pubsub_subscription: Deliver event message
    pubsub_subscription->>service: HTTP push event
    service->>service: Validate event and<br>generate channel-specific payloads
    service->>channel_slack: Send generated notification content
    channel_slack-->>service: Return channel response
    service-->>pubsub_subscription: Acknowledge message
```

(1) Cloud Build emits lifecycle events. Cloud Build generates events when a build changes state, such as started, succeeded, or failed. These events serve as the trigger for downstream notifications. For the full list of events, see Appendix 2: Build status enumeration.
(2) Events are published to Pub/Sub. Each build event is published to a Pub/Sub topic, providing durable storage and decoupling build execution from notification delivery. For payload examples, see Appendix 3: Cloud Build Payload Examples.

(3) Subscription pushes events to the Notification Service. A Pub/Sub push subscription forwards the event to the Notification Service using an HTTP request. Delivery follows at-least-once semantics.

(4) Notification Service processes the event. The Notification Service validates the event payload, applies notifier rules, and transforms the data into a Slack-compatible message format.

(5) Notification is delivered to Slack. The service sends the formatted message to the configured Slack channel using a webhook or bot integration, making build status visible to the engineering team.

(6) Notification Service receives delivery confirmation from Slack. After sending the formatted message, the Notification Service receives a confirmation response from Slack indicating that the message was successfully delivered to the designated channel.

(7) Acknowledgement is returned to Pub/Sub. After successful processing and delivery, the Notification Service acknowledges the message to the Pub/Sub subscription. This acknowledgement signals that the event has been handled and prevents redelivery.
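Steps (2)–(3) deliver a standard Pub/Sub push envelope over HTTP. A minimal decoding sketch follows (the helper name is an assumption; the field names follow the Pub/Sub push format):

```python
import base64
import json


def decode_push_envelope(envelope: dict) -> tuple[str, dict]:
    """Extract the build status attribute and the decoded Build resource
    from a Pub/Sub push envelope (delivered with at-least-once semantics)."""
    message = envelope["message"]
    # Quick routing information lives in the attributes
    status = message["attributes"]["status"]
    # `data` carries the base64-encoded JSON of the full Build resource
    build = json.loads(base64.b64decode(message["data"]))
    return status, build
```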
Physical View¶
The system is deployed on Google Cloud Platform (GCP) using a serverless architecture based on Cloud Run. Application components are packaged and delivered as container images, built and stored in an artifact registry, enabling consistent, portable deployments. This approach removes the need to manage underlying servers while providing automatic scaling, high availability, and operational simplicity.
The system operates across two isolated environments: development and production, which are distinguished by the configuration variable DEPLOYMENT_ENVIRONMENT_STATE. Production deployments are automatically triggered by changes to the production branch in the GitHub repository, initiating a CI/CD pipeline that builds the container image, publishes it to the Artifact Registry, and deploys it to Cloud Run. In contrast, the development environment relies on Docker and Docker Compose to run containerized services locally, enabling rapid iteration, testing, and debugging without impacting production workloads.
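For the development environment, a docker-compose sketch along these lines could run the service locally (the service name, port, and env file are assumptions, not taken from the repository):

```yaml
services:
  notification-service:
    build:
      context: .
      args:
        DEPLOYMENT_ENVIRONMENT_STATE: development
    environment:
      DEPLOYMENT_ENVIRONMENT_STATE: development
      PORT: "8080"
    env_file:
      - .env.development  # local Slack token and project settings (not committed)
    ports:
      - "8080:8080"
```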
```mermaid
flowchart LR
    %% Components
    subgraph gh[GitHub]
        repository[Repository]
    end
    subgraph channel[Communication Channel]
        slack[Slack]
    end
    subgraph Google Cloud Platform
        cbuild[Cloud Build]
        subgraph integrated[Integrated]
            direction TB
            secret[Cloud Secret Manager]
            logging[Cloud Logging]
            iam[Cloud IAM]
            artifact[Cloud Artifact]
        end
        subgraph notification[Notification Service]
            run[Cloud Run]
        end
    end
    %% Flow
    gh <-- sync/trigger --> cbuild
    cbuild -- build/store --> artifact
    artifact -- create --> notification
    cbuild -- deploy --> notification
    secret <-. mount .-> notification
    iam -- control access --> notification
    notification -- yield logs --> logging
    notification --> channel
```

From a deployment standpoint, the Notification Service operates as a stateless workload on Cloud Run, with each instance running independently and managed entirely by the platform. Runtime configuration and sensitive information are injected via environment variables sourced from Google Secret Manager, while GCP IAM governs access control and service-to-service permissions. This architecture enables secure communication, elastic scaling, and safe instance lifecycle management without relying on local state.
The service interacts with external communication channels over secure protocols and is designed for horizontal scalability and operational resilience. Its core deployment characteristics include:
- Horizontal, automatic scaling
- Stateless request and event processing
- Externalized configuration and secret management
- Support for optional asynchronous extensions (e.g., queues or Pub/Sub)
As part of the deployment, the workload is configured to integrate with Cloud Build triggers by mapping build events to a Pub/Sub topic. The Notification Service registers its endpoint as a subscriber to this topic, allowing it to receive and process build-related events asynchronously. In addition, a Slack bot token must be configured with the appropriate permissions to use the slack_sdk.chat_postMessage API, enabling the service to send notification messages to designated Slack channels.
The detailed workload then maps to physical services as shown below:
```mermaid
flowchart LR
    %% Component
    subgraph Google Cloud Platform
        cbuild[Cloud Build]
        subgraph ps[Pub/Sub]
            pub[Topic]
            sub[Subscription]
        end
        subgraph notification[Notification Service]
            run[Cloud Run]
        end
    end
    subgraph channel[Communication Channel]
        slack[Slack]
    end
    %% Flow
    cbuild --> pub --> sub --> run --> slack
```

For the notification service, the following components are used:
| Code | Resource | Identifier | Description |
|---|---|---|---|
| COMPONENT-01 | GitHub | Repository notification | Contains the notification service |
| COMPONENT-02 | GitHub | Repository infra | Contains the infrastructure components in GCP |
| COMPONENT-03 | Cloud Build | Private in asia-southeast1 | CI/CD platform |
| COMPONENT-04 | Pub/Sub [Topic] | Topic cloud-builds in asia-southeast1 | The topic that Cloud Build publishes messages into |
| COMPONENT-05 | Pub/Sub [Sub] | Subscription, asia-southeast1 | The subscription of the topic cloud-builds |
| COMPONENT-06 | Cloud Run | $ENV-notification | Serverless runtime of the notification service |
| COMPONENT-07 | Slack | Slack application | Agent that delivers notifications |
Appendix 4: Component details shows the details of the system components.
Based on the system components, the labels described below are applied to each component to clarify its role, responsibility, and deployment context.
| Label | Value | Description |
|---|---|---|
| team | thuyetbao | Owning team responsible for development, maintenance, and support |
| environment | development, production | Deployment environment used to separate non-prod and prod workloads |
| service | notification-service | Logical service name for identification and routing |
| cost-center | notification-platform | Cost allocation and billing tracking |
| compliance | internal, restricted | Data sensitivity or compliance classification |
Security & Compliance¶
Integration with Slack¶
The Notification Service integrates with Slack through a Slack Bot to send automated, event-driven notifications to designated channels using the chat.postMessage API. The bot authenticates via a Bot User OAuth Token, which is securely managed and injected at runtime through the system’s secret management mechanism.
The bot operates with a minimal permission set, ensuring secure message delivery while adhering to the principle of least privilege.
| Permission | Description |
|---|---|
| chat:write | Send messages |
| chat:write.customize | Send messages with a customized username and avatar |
| chat:write.public | Send messages to channels the bot isn't a member of |
See the Slack Reference > Scopes documentation for more information.
Service Account¶
Service account: Default service agent of PubSub¶
Identity: serviceAccount:service-${PROJECT_NUMBER}@gcp-sa-pubsub.iam.gserviceaccount.com
Alias: SA_AGENT_PUBSUB
| Permissions | Identifiers | Purpose |
|---|---|---|
| roles/iam.serviceAccountTokenCreator | Subscription | Generate OIDC tokens on a service account |
Service account: sa-spirit-breaker - Run Subscription¶
Identity: serviceAccount:sa-spirit-breaker@${PROJECT_ID}.iam.gserviceaccount.com (Alias: sa_spirit_breaker)
Permission:
| Service | Permissions | Identifiers | Purpose |
|---|---|---|---|
| sa_spirit_breaker | roles/run.invoker | Cloud Run | Invoke the Cloud Run service related to COMPONENT-06 |
The deployment of builds related to the notification service is handled by:
Service account: sa-nature-prophet - Project builder¶
Identity: serviceAccount:sa-nature-prophet@$PROJECT_ID.iam.gserviceaccount.com (Alias: sa-nature-prophet)
Permission:
| Permission | Identifiers | Purpose |
|---|---|---|
| roles/cloudbuild.builds.builder | Cloud Build | Cloud Build builder |
| roles/iam.serviceAccountUser | Cloud Run | Impersonate the service account on the Cloud Run service identity |
| roles/iam.serviceAccountTokenCreator | Service account | Impersonate a service account |
| roles/secretmanager.secretAccessor | Secret | Access secrets |
| roles/storage.admin | GCS::dock-internal-store | Write build logs into the bucket |
| roles/artifactregistry.reader | Artifact Registry | Read container images |
| roles/run.developer | Cloud Run | Deploy Run services |
| roles/run.services.setIamPolicy | Cloud Run | Set IAM policy for Run services |
| roles/monitoring.metricWriter | Cloud Monitoring | Write monitoring data to a metrics scope |
| roles/logging.logWriter | Cloud Logging | Write log entries |
Service account: sa-techies - Execute notification¶
Identity: serviceAccount:sa-techies@$PROJECT_ID.iam.gserviceaccount.com (Alias: sa-techies)
Permission:
| Permissions | Identifiers | Purpose |
|---|---|---|
| roles/run.developer | Cloud Run | Create or update a job |
| roles/run.invoker | Cloud Run | Execute jobs or cancel job executions |
| roles/artifactregistry.reader | Artifact Registry | Read the container images used by the job |
| roles/secretmanager.secretAccessor | Secret Manager | Access the payload of secrets |
| roles/logging.logWriter | Logging | Write log entries |
| roles/errorreporting.writer | Error Reporting | Send error events to Error Reporting |
| roles/monitoring.metricWriter | Monitoring | Write metrics |
| roles/serviceusage.serviceUsageConsumer | Service Usage | Consume service operations |
| roles/serviceusage.serviceUsageViewer | Service Usage | Inspect service states and operations |
Implementation¶
The following procedures are used to implement the notification service:
(1) Use Terraform to provision the related components: APIs, IAM, Pub/Sub
```hcl
# ---------------------------------------------------------------------------------------------
# Service: Notification Service 🐲 -----------------------------------------------------------
# ---------------------------------------------------------------------------------------------
resource "google_project_service" "required_apis" {
  for_each = toset([
    "artifactregistry.googleapis.com",
    "cloudbuild.googleapis.com",
    "logging.googleapis.com",
    "run.googleapis.com",
  ])
  project            = var.project_id
  service            = each.key
  disable_on_destroy = false
}

# ---------------------------------------------------------------------------------------------
# Service Accounts ----------------------------------------------------------------------------
# ---------------------------------------------------------------------------------------------
resource "google_service_account" "sa_spirit_breaker" {
  account_id   = "sa-spirit-breaker"
  display_name = "SA Spirit Breaker"
  description  = "Spirit Breaker - Charges down enemies from anywhere on the map"
}

resource "google_project_iam_member" "sa_spirit_breaker" {
  for_each = toset([
    "roles/run.invoker",
  ])
  project = var.project_id
  role    = each.key
  member  = "serviceAccount:${google_service_account.sa_spirit_breaker.email}"
  depends_on = [
    google_service_account.sa_spirit_breaker,
  ]
}

resource "google_service_account" "sa_techies" {
  account_id   = "sa-techies"
  display_name = "SA Techies"
  description  = "Techies - Surprises enemies with invisible landmines and explosive attacks"
}

resource "google_project_iam_member" "sa_techies" {
  for_each = toset([
    "roles/run.developer",
    "roles/run.invoker",
    "roles/secretmanager.secretAccessor",
    "roles/logging.logWriter",
    "roles/errorreporting.writer",
    "roles/monitoring.metricWriter",
    "roles/serviceusage.serviceUsageViewer",
    "roles/serviceusage.serviceUsageConsumer",
  ])
  project = var.project_id
  role    = each.key
  member  = "serviceAccount:${google_service_account.sa_techies.email}"
  depends_on = [
    google_service_account.sa_techies,
  ]
}

# ---------------------------------------------------------------------------------------------
# Cloud Run -----------------------------------------------------------------------------------
# ---------------------------------------------------------------------------------------------
data "google_cloud_run_service" "notification_service" {
  name     = "prod-notification-service"
  location = var.project_region
  project  = var.project_id
}

# ---------------------------------------------------------------------------------------------
# Pub/Sub -------------------------------------------------------------------------------------
# ---------------------------------------------------------------------------------------------
resource "google_pubsub_topic" "notification_cloudbuild" {
  name    = "cloud-builds" # Cloud Build only pushes to this name (DO NOT CHANGE)
  project = var.project_id
  labels = {
    team        = "thuyetbao"
    environment = "production"
  }
  message_retention_duration = "21600s"
}

resource "google_pubsub_subscription" "notification_cloudbuild" {
  name                       = "prod-sub-notification-cloudbuild"
  topic                      = google_pubsub_topic.notification_cloudbuild.id
  message_retention_duration = "21600s"
  retain_acked_messages      = true
  ack_deadline_seconds       = 60
  labels = {
    team        = "thuyetbao"
    environment = "production"
  }
  expiration_policy {
    ttl = "" # The resource never expires
  }
  push_config {
    push_endpoint = "${data.google_cloud_run_service.notification_service.status[0].url}/event/cloudbuild"
    oidc_token {
      service_account_email = google_service_account.sa_spirit_breaker.email
    }
  }
  retry_policy {
    minimum_backoff = "30s"
    maximum_backoff = "600s"
  }
  depends_on = [
    google_service_account.sa_techies,
    google_pubsub_topic.notification_cloudbuild,
  ]
}
```
(2) Configure the Slack application through the app dashboard at api.slack.com/apps with an email account.
For this, use the manifest feature, which deploys the application through a YAML configuration file. See: App Manifest Reference.
The following is the configuration for Flycatcher, the name representing the Slack agent:
```yaml
_metadata:
  major_version: 2
  minor_version: 1
display_information:
  name: Flycatcher
  long_description: >
    Flycatcher is a Slack notification agent that processes events from the notification service
    and generates interactive, rich messages to manage workflows like Cloud Build updates.
  description: Flycatcher - Agent to deliver messages to communication channels
  background_color: "#0b1c4f"
features:
  bot_user:
    display_name: Flycatcher
    always_online: true
oauth_config:
  scopes:
    bot:
      - chat:write
      - chat:write.customize
      - chat:write.public
      # - channels:history
      # - groups:history
      # - groups:read
      # - groups:write
      # - users.profile:read
      # - commands
      # - files:read
      # - files:write
      # - im:history
      # - incoming-webhook
      # - mpim:history
  # redirect_urls:
  #   - https://example.com/slack/auth
settings:
  org_deploy_enabled: false
  socket_mode_enabled: false
  token_rotation_enabled: false
  # event_subscriptions:
  #   request_url: https://localhost/event/slack
  #   bot_events:
  #     - message.channels
  #     - message.groups
  #     - message.im
  # interactivity:
  #   is_enabled: true
  #   request_url: https://example.com/slack/message_action
```
When the application build is finished, its configuration and credentials are available in the app dashboard.
Then add the application to the Slack organization (this requires admin roles).
After that, the Bot User OAuth Token is available under Application > OAuth & Permissions > Bot User OAuth Token.
(3) Implement route /event/cloudbuild with FastAPI, Python
```python
#!/bin/python3

# Global
import json
import random

# External
import structlog
from fastapi import (
    APIRouter,
    Depends,
    status,
)
from fastapi.templating import Jinja2Templates
from jinja2 import Template
from slack_sdk import WebClient as SlackWebClient

from src.google.cloudbuild import CloudBuildStatus

# Internal
import dependencies
from config import __CONFIG__

# Context
from endpoint.events.cloudbuild.model import BuildPushNotificationPayload

router = APIRouter(
    prefix="/event",
    tags=["Event"],
)

# Construct
LOG: structlog.stdlib.BoundLogger = structlog.get_logger()

# The mapping of representation icon based on source
# For keys: ref to <https://docs.cloud.google.com/build/docs/api/reference/rest/v1/projects.builds#Build.Source>
representationIconMapping: dict[str, str] = {
    "gitSource": "https://cdn-icons-png.flaticon.com/512/2111/2111292.png",
    "repoSource": "https://cdn-icons-png.flaticon.com/512/2111/2111292.png",
    "storageSource": "https://cdn-icons-png.flaticon.com/512/1975/1975660.png",
}

# The default channel if not specified
DEFAULT_CHANNEL_ID: str = "C0A6G9EPC11"

# The default maintainer if not specified
DEFAULT_MAINTAINER_MEMBER_ID: str = "U0A6G98D4CB"

# The default image for representation icon
DEFAULT_REPRESENTATION_ICON: str = "https://cdn-icons-png.flaticon.com/512/8637/8637097.png"


@router.post(
    path="/cloudbuild",
    description="Handle state change from Cloud Build",
    summary="Handle state change from Cloud Build",
    status_code=status.HTTP_200_OK,
)
def handleEventFromCloudBuild(
    payload: BuildPushNotificationPayload,
    templates: Jinja2Templates = Depends(dependencies.yield_template),
    client_slack: SlackWebClient = Depends(dependencies.yield_slack_client),
):
    # Filter messages on result statuses only
    if payload.message.attributes.status not in CloudBuildStatus.get_result_status():
        return {"status": "ok"}

    # Template
    tpl_build_metadata: Template = templates.get_template(name="event/cloudbuild/build_metadata.jinja")
    tpl_call_maintainer: Template = templates.get_template(name="global/call_maintainer.jinja")

    # Construct
    build_event = payload.load_composite()
    build_status = build_event.status
    build_maintainer = build_event.substitutions.get("_BUILD_MAINTAINER_SLACK_ID", DEFAULT_MAINTAINER_MEMBER_ID)
    build_slack_channel_id = build_event.substitutions.get("_BUILD_SLACK_CHANNEL_ID", DEFAULT_CHANNEL_ID)
    build_representation_icon = representationIconMapping.get(list(build_event.source)[0], DEFAULT_REPRESENTATION_ICON)
    current_thread_ts = None

    # Component
    component = build_event.model_dump() | {
        "attachment_color": build_status.hex_color,
        "message_detail": build_status.message,
        # Source with redacted project metadata
        "reference_source": (
            build_event.build_source_reference_name()
            .replace(__CONFIG__.GOOGLE_PROJECT_ID, "---")
            .replace(__CONFIG__.GOOGLE_PROJECT_LOCATION, "---")
        ),
        "tags_composite": ", ".join(["`" + x.strip() + "`" for x in build_event.tags]),
        "run_time": {
            "timezone": {
                "type": "UTC",
                "format": "UTC",
            },
            "createTime": build_event.createTime.strftime("%Y-%m-%d"),
            "timeframe": {
                "from_hhmm": build_event.startTime.strftime("%H:%M:%S"),
                "to_hhmm": build_event.finishTime.strftime("%H:%M:%S"),
            },
            "total_seconds": "{:.2f}".format(build_event.total_seconds()) if build_event.total_seconds() is not None else "---",
        },
        "representation_icon": build_representation_icon,
    }
    LOG.debug(f"Event content: {build_event}")

    # Send
    content_build_metadata = tpl_build_metadata.render(**component)
    message_prefix_icon = random.choice(["stars", "milky_way", "sparkles", "package", "building_construction"])
    message_title = f":{message_prefix_icon}: Build *{build_event.id}* has been deployed"
    result_build_metadata = client_slack.chat_postMessage(
        as_user=True,
        channel=build_slack_channel_id,
        text=message_title,
        attachments=json.loads(content_build_metadata, strict=False),
    )
    result_build_metadata.validate()
    current_thread_ts = result_build_metadata["ts"]

    # If any failure exists, call the maintainer of the pipeline
    if all([
        build_status in CloudBuildStatus.get_failure_status(),
        any([
            # Handle tags on production only
            any([tag.lower().startswith("prod") for tag in build_event.tags]),
        ]),
    ]):
        # Send
        content_call_maintainer = tpl_call_maintainer.render(user_id=build_maintainer)
        result_call_maintainer = client_slack.chat_postMessage(
            as_user=True,
            channel=build_slack_channel_id,
            thread_ts=current_thread_ts,
            attachments=json.loads(content_call_maintainer, strict=False),
        )
        result_call_maintainer.validate()
    return {"status": "ok"}
```
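The escalation condition in the failure branch above can be isolated into a small, testable predicate (the function name is illustrative; the logic mirrors the route's `if all([...])` check):

```python
def should_call_maintainer(is_failure: bool, tags: list[str]) -> bool:
    """Call the maintainer only for failed builds whose tags mark production."""
    return is_failure and any(tag.lower().startswith("prod") for tag in tags)
```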
The payload model for the Pub/Sub Cloud Build event is implemented with Pydantic as follows:
```python
#!/bin/python3

# External
from pydantic import (
    BaseModel,
    ConfigDict,
    Field,
    field_validator,
    ValidationInfo,
)

from src.google.cloudbuild import CloudBuildStatus
from src.google.pubsub import CloudPubSubPushMessage, SubcriptionIdentifier
from src.google.cloudbuild import CloudBuildResouceResult


class CloudPubSubPushNotification(BaseModel):
    """Push notification from Pub/Sub

    Reference
    ---------
    https://cloud.google.com/build/docs/subscribe-build-notifications#push

    Example
    -------
    {
        "message": {
            "attributes": {
                "buildId": "abcd-efgh...",
                "status": "SUCCESS"
            },
            "data": "SGVsbG8gQ2xvdWQgUHViL1N1YiEgSGVyZSBpcyBteSBtZXNzYWdlIQ==",
            "message_id": "136969346945"
        },
        "subscription": "projects/myproject/subscriptions/mysubscription"
    }
    """

    message: CloudPubSubPushMessage = Field(default=...)
    subscription: SubcriptionIdentifier = Field(default=...)

    @field_validator("subscription", mode="before")
    @classmethod
    def map_subcription_structure(cls, value: str):
        "Parse projects/myproject/subscriptions/mysubscription"
        _, project_id, _, subcription_id = value.split("/")
        value = SubcriptionIdentifier.model_validate({
            "project_id": project_id,
            "subcription_id": subcription_id,
        })
        return value

    model_config = ConfigDict(
        # Serialize the subscription back to its path form
        json_encoders={
            SubcriptionIdentifier: lambda v: v.identifier()
        },
    )


class CloudBuildPayload(BaseModel):
    buildId: str = Field(default=..., description="ID of build")
    status: CloudBuildStatus = Field(default=..., description="Status of build event")

    model_config = ConfigDict(extra="allow")

    @field_validator("status", mode="before")
    @classmethod
    def parse_status(cls, values, info: ValidationInfo):
        try:
            return CloudBuildStatus.search(keyword=values)
        except ValueError as exc:
            raise ValueError(f"{info.field_name} has unknown value") from exc


class CloudBuildPushMessage(CloudPubSubPushMessage):
    """Cloud Build message for an event

    Usage
    -----
    {
        "message": {
            "attributes": {
                "buildId": "abcd-efgh...",
                "status": "SUCCESS"
            },
            "data": "SGVsbG8gQ2xvdWQgUHViL1N1YiEgSGVyZSBpcyBteSBtZXNzYWdlIQ==",
            "message_id": "136969346945"
        },
        "subscription": "projects/myproject/subscriptions/mysubscription"
    }
    """

    attributes: CloudBuildPayload = Field(default=...)


class BuildPushNotificationPayload(CloudPubSubPushNotification):
    """Payload from the subscription for a Cloud Build event

    Reference
    ---------
    [Subscribe build notifications](https://cloud.google.com/build/docs/subscribe-build-notifications)
    """

    message: CloudBuildPushMessage = Field(default=..., description="The message of the push payload from Cloud Build")

    def load_composite(self) -> CloudBuildResouceResult:
        metadata = self.message.decode_message_data()
        return CloudBuildResouceResult.model_validate(metadata)
```
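The subscription-path parsing performed by `map_subcription_structure` is a simple split, shown here as a standalone sketch without the internal `SubcriptionIdentifier` type (the dictionary keys mirror the model's field names):

```python
def parse_subscription_path(path: str) -> dict:
    """Split 'projects/<project>/subscriptions/<sub>' into its components."""
    kind_a, project_id, kind_b, subscription_id = path.split("/")
    if (kind_a, kind_b) != ("projects", "subscriptions"):
        raise ValueError(f"unexpected subscription path: {path!r}")
    return {"project_id": project_id, "subcription_id": subscription_id}
```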
Once the implementation is complete, the service is available with API documentation at /documentation.
(4) Implement Dockerfile to build image
```dockerfile
# Note: Bullseye is the latest Debian release published in 2021
FROM python:3.12.8-bullseye

# Run in non-interactive mode
ARG DEBIAN_FRONTEND=noninteractive

# The language of the Linux system
ENV LANG=C.UTF-8

# Overrides the encoding used for stdin/stdout/stderr
ENV PYTHONIOENCODING=utf-8

# Prevent Python from writing pyc files to disc
ARG PYTHONDONTWRITEBYTECODE=1

# Prevent Python from buffering stdout and stderr
ARG PYTHONUNBUFFERED=1

# Using default bash shell
# ref: https://deepsource.io/directory/analyzers/docker/issues/DOK-DL4005
SHELL ["/bin/bash", "-c"]

# The commit ID associated with your build
ARG REVISION_ID

# The first seven characters of COMMIT_SHA
ARG SHORT_SHA

# The full name of your repository, including either the user or organization
ARG REPO_FULL_NAME

# Config for deployment environment state
# Declared as ARG so `--build-arg DEPLOYMENT_ENVIRONMENT_STATE=...` takes effect,
# then promoted to ENV so it is also available at runtime
ARG DEPLOYMENT_ENVIRONMENT_STATE=development
ENV DEPLOYMENT_ENVIRONMENT_STATE=$DEPLOYMENT_ENVIRONMENT_STATE

# Metadata
LABEL com.service.notification.service.name="notification-service"
LABEL com.service.notification.service.version="1.14.14"
LABEL com.service.notification.service.maintainer="trthuyetbao@gmail.com"
LABEL com.service.notification.service.contributors="trthuyetbao@gmail.com"
LABEL com.service.notification.service.tags="notification-service|supported-on:slack-email-sms"
LABEL com.service.notification.image.repo_full_name="$REPO_FULL_NAME"
LABEL com.service.notification.image.short_sha="$SHORT_SHA"
LABEL com.service.notification.image.revision_id="$REVISION_ID"

# Set working directory
WORKDIR /usr/bin/notification-service/

# Transfer
COPY ./backend/requirements.txt ./requirements.txt
COPY ./backend/requirements-dev.txt ./requirements-dev.txt

# Dependencies (quote the specifier so the shell does not treat `>=` as a redirect)
RUN pip install --upgrade "pip>=25.3";
RUN pip install --no-cache-dir -r requirements.txt --default-timeout 100;

# Tooling (development only)
RUN if [[ "$DEPLOYMENT_ENVIRONMENT_STATE" == "development" ]]; then \
    pip install -r requirements-dev.txt --default-timeout 100; \
    fi;

# Sync application
COPY ./backend/ ./

# Cloud Run controls $PORT
EXPOSE $PORT

# Exec
ENTRYPOINT [ "/bin/bash", "-l", "-c", "uvicorn entrypoint:app --host 0.0.0.0 --port $PORT --workers 1" ]
```
(5) Reference build with cloudbuild/production-deploy-application.yaml
```yaml
# Run by: sa-nature-prophet@$PROJECT_ID.iam.gserviceaccount.com
tags: ["$_BUILD_ENV", "$_BUILD_ACTION", "sha-$SHORT_SHA"]
substitutions:
  _BUILD_ENV: "production"
  _BUILD_ACTION: "deploy-notification-service"
  _BUILD_POOL: "private-pool-build-workers"
  _BUILD_SERVICE_ACCOUNT_NAME: "sa-nature-prophet"
  _BUILD_MAINTAINER_SLACK_ID: "U0A6G98D4CB" # thuyetbao
  _BUILD_SLACK_CHANNEL_ID: "C0A6NLB2WRG" # Channel #all-thuyetbao
  _ARTIFACT_REGISTRY_REPOSITORY_NAME: "dock-artifact-registry"
  _RUN_APPLICATION_NAME: "prod-notification-service"
  _RUN_APPLLICATION_LOCATION: "${LOCATION}"
  _RUN_APPLICATION_DESCRIPTION: "[Notification] (Production) Flycatcher - Serverless notification service"
  _RUN_IMAGE_NAME: "${_RUN_APPLICATION_NAME}"
  _RUN_SERVICE_ACCOUNT_NAME: "sa-techies"
steps:
  # Docker build
  - name: "gcr.io/cloud-builders/docker"
    id: build_docker_image
    dir: project/sandbox/notification-service
    entrypoint: bash
    args:
      - "-c"
      - |
        # Setting
        set -e
        # On context
        DOCKER_BUILDKIT=1 \
        docker build \
          --build-arg DEPLOYMENT_ENVIRONMENT_STATE=$_BUILD_ENV \
          --build-arg REPO_FULL_NAME=$REPO_FULL_NAME \
          --build-arg SHORT_SHA=$SHORT_SHA \
          --build-arg REVISION_ID=$REVISION_ID \
          -t $LOCATION-docker.pkg.dev/$PROJECT_ID/$_ARTIFACT_REGISTRY_REPOSITORY_NAME/$_RUN_IMAGE_NAME:latest \
          -f ./Dockerfile .
  # Docker push to Google Artifact Registry
  - name: "gcr.io/cloud-builders/docker"
    id: push_artifact_registry
    waitFor:
      - build_docker_image
    args:
      [
        "push",
        "$LOCATION-docker.pkg.dev/$PROJECT_ID/$_ARTIFACT_REGISTRY_REPOSITORY_NAME/$_RUN_IMAGE_NAME:latest",
      ]
  # Service status
  - name: "gcr.io/cloud-builders/gcloud"
    id: verify_run_deployment_status
    waitFor:
      - push_artifact_registry
    entrypoint: "bash"
    args:
      - "-c"
      - |
        # Setting
        set -e
        # Get status
        declare service_status=$(gcloud run services list \
          --region=$LOCATION \
          --filter="metadata.name=$_RUN_APPLICATION_NAME" \
          --format="value(name)" \
          --verbosity=none \
        );
        # Check whether the Run service exists
        # If it exists, remove the service for a fresh deployment
        if [[ ! -z "$service_status" ]];
        then
          gcloud run services delete $_RUN_APPLICATION_NAME \
            --region=$LOCATION \
            --verbosity=none \
            --quiet;
        fi;
  # Deploy to Cloud Run
  - name: "gcr.io/cloud-builders/gcloud"
    waitFor:
      - verify_run_deployment_status
    args:
      [
        "run",
        "deploy",
        "$_RUN_APPLICATION_NAME",
        "--description=$_RUN_APPLICATION_DESCRIPTION",
        "--execution-environment=gen2",
        "--allow-unauthenticated",
        "--no-cpu-throttling",
        "--region=$_RUN_APPLLICATION_LOCATION",
```
"--image=$LOCATION-docker.pkg.dev/$PROJECT_ID/$_ARTIFACT_REGISTRY_REPOSITORY_NAME/$_RUN_IMAGE_NAME:latest",
"--service-account=$_RUN_SERVICE_ACCOUNT_NAME@$PROJECT_ID.iam.gserviceaccount.com",
"--cpu=1",
"--memory=512Mi",
# "--startup-probe=timeoutSeconds=10,httpGet.port=15555",
"--ingress=all",
"--min-instances=1",
"--max-instances=1",
"--concurrency=20",
"--timeout=5m",
"--set-env-vars=DEPLOYMENT_ENVIRONMENT_STATE=production",
"--set-env-vars=APPLICATION_LOGGING_LEVEL=INFO",
"--set-env-vars=APPLICATION_DEBUG_MODE=0",
"--set-env-vars=GOOGLE_PROJECT_ID=$PROJECT_ID",
"--set-env-vars=GOOGLE_PROJECT_LOCATION=$LOCATION",
# "--set-secrets=GOOGLE_PROJECT_ID=$PROJECT_ID",
# "--set-secrets=GOOGLE_PROJECT_LOCATION=$LOCATION",
"--set-secrets=SLACK_BOT_OAUTH_TOKEN=SLACK_BOT_OAUTH_TOKEN:latest",
"--set-secrets=SLACK_SIGNING_SECRET=SLACK_SIGNING_SECRET:latest",
"--set-secrets=SENDGRID_API_KEY=SENDGRID_API_KEY:latest",
"--labels=repository=$REPO_NAME,on_commit=$SHORT_SHA,team=thuyetbao,managed_by=cloudbuild,slack=flycatcher",
"--tag=latest",
]
timeout: "1200s"
logsBucket: "gs://dock-internal-store"
images:
- "$LOCATION-docker.pkg.dev/$PROJECT_ID/$_ARTIFACT_REGISTRY_REPOSITORY_NAME/$_RUN_IMAGE_NAME:latest"
options:
dynamicSubstitutions: true
logStreamingOption: "STREAM_ON"
pool:
name: "projects/$PROJECT_ID/locations/$LOCATION/workerPools/$_BUILD_POOL"
(6) Implement the build process by trigger production branch in GitHub repository
(7) Set variables environments for run executions
| Variable | Type | Description | Values |
|---|---|---|---|
| DEPLOYMENT_ENVIRONMENT_STATE | Env Var | Deployment Environment | One of: development, production |
| APPLICATION_LOGGING_LEVEL | Env Var | Logging level for application | One of: NOTSET, DEBUG, INFO, WARNING, ERROR, CRITICAL. |
| APPLICATION_DEBUG_MODE | Env Var | The debug mode | Set to 1 to enable debug mode in the development environment; 0 in production |
| GOOGLE_PROJECT_ID | Env Var | Cloud Project ID | Passed via --set-env-vars at deploy time |
| GOOGLE_PROJECT_LOCATION | Env Var | Cloud Project Region | Passed via --set-env-vars at deploy time |
| SLACK_BOT_OAUTH_TOKEN | Secret | Slack Bot Token | Slack Bot Token |
| SLACK_SIGNING_SECRET | Secret | Slack Signing Secret | Slack Signing Secret |
| SENDGRID_API_KEY | Secret | Sendgrid API Key | Sendgrid API Key |
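The variables above can be read into a single settings object at service start-up. A minimal sketch follows; the function name, structure, and defaults are illustrative assumptions, not the actual implementation:

```python
import os


def load_settings() -> dict:
    """Read runtime configuration from the environment variables listed above."""
    return {
        # Deployment environment: "development" or "production"
        "environment": os.environ.get("DEPLOYMENT_ENVIRONMENT_STATE", "development"),
        # Logging level: NOTSET, DEBUG, INFO, WARNING, ERROR, CRITICAL
        "log_level": os.environ.get("APPLICATION_LOGGING_LEVEL", "INFO"),
        # Debug mode is enabled when the variable is set to "1"
        "debug": os.environ.get("APPLICATION_DEBUG_MODE", "0") == "1",
        "project_id": os.environ.get("GOOGLE_PROJECT_ID", ""),
        "location": os.environ.get("GOOGLE_PROJECT_LOCATION", ""),
    }
```

Centralizing the reads in one function keeps the rest of the service free of direct `os.environ` access.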
(8) Deploy the service into Cloud Run by triggering the production branch in the GitHub repository
After a successful deployment, the service is available.
The URL for the application looks like https://prod-notification-service-[project-id].[region].run.app
(9) Verify the notification from processing a Cloud Build build event
A successful build will generate the following message in Slack.
In case of error, the maintainer will be called out:
For example, with a build of an application for deployment orchestration, the following is an example:
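The maintainer call-out can be composed as a Slack mention using the member ID from the `_BUILD_MAINTAINER_SLACK_ID` substitution. A minimal sketch, where the function name and message wording are assumptions:

```python
def compose_build_message(build: dict, maintainer_slack_id: str) -> str:
    """Compose a Slack message line for a build event; mention the maintainer on failure."""
    status = build.get("status", "STATUS_UNKNOWN")
    line = f"Build {build.get('id', '?')} finished with status {status}"
    if status in {"FAILURE", "INTERNAL_ERROR", "TIMEOUT"}:
        # Slack mention syntax for a member ID: <@MEMBER_ID>
        line += f" <@{maintainer_slack_id}>"
    return line
```

For a SUCCESS build the message carries no mention; failed states append the maintainer's mention so Slack notifies them directly.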
Reference¶
- Google instruction CloudBuild Notifier
- PubSub messages returned subscribe-build-notifications
- IAM for CloudRun with pre-defined roles run > docs > reference > iam > roles
Appendix¶
Appendix 1: Record of Changes¶
Table: Record of changes
| Version | Date | Author | Description |
|---|---|---|---|
| 0.9.10 | 2026/01/13 | thuyetbao | Updated reference code and runtime assets |
| 0.8.14 | 2026/01/13 | thuyetbao | Added Slack configuration and references |
| 0.7.9 | 2026/01/12 | thuyetbao | Added detail implementation codes |
| 0.6.30 | 2026/01/12 | thuyetbao | Updated physical view and component details |
| 0.5.12 | 2026/01/11 | thuyetbao | Updated introduction, logical view, and diagrams |
| 0.4.5 | 2026/01/06 | thuyetbao | Updated Build component and examples |
| 0.3.9 | 2026/01/04 | thuyetbao | Updated service account |
| 0.2.3 | 2026/01/04 | thuyetbao | Added SAD, introduction, and permissions |
| 0.1.0 | 2026/01/01 | thuyetbao | Initiation documentation |
Appendix 2: Build status enumeration¶
The possible statuses of a build or build step are enumerated below:
| State | Description |
|---|---|
| STATUS_UNKNOWN | Status of the build is unknown. |
| PENDING | Build has been created and is pending execution and queuing. It has not been queued. |
| QUEUED | Build or step is queued; work has not yet begun. |
| WORKING | Build or step is being executed. |
| SUCCESS | Build or step finished successfully. |
| FAILURE | Build or step failed to complete successfully. |
| INTERNAL_ERROR | Build or step failed due to an internal cause. |
| TIMEOUT | Build or step took longer than was allowed. |
| CANCELLED | Build or step was canceled by a user. |
| EXPIRED | Build was enqueued for longer than the value of queueTtl. |
Reference: API Reference Project Build Status
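Inside the service, the statuses above can be modeled as an enumeration; a sketch follows. The grouping of states that should alert the maintainer is an assumption for illustration, not part of the Cloud Build API:

```python
from enum import Enum


class BuildStatus(str, Enum):
    """Build / build-step statuses as defined by the Cloud Build API."""
    STATUS_UNKNOWN = "STATUS_UNKNOWN"
    PENDING = "PENDING"
    QUEUED = "QUEUED"
    WORKING = "WORKING"
    SUCCESS = "SUCCESS"
    FAILURE = "FAILURE"
    INTERNAL_ERROR = "INTERNAL_ERROR"
    TIMEOUT = "TIMEOUT"
    CANCELLED = "CANCELLED"
    EXPIRED = "EXPIRED"


# States that should trigger a maintainer call-out (an illustrative grouping)
FAILED_STATES = {BuildStatus.FAILURE, BuildStatus.INTERNAL_ERROR, BuildStatus.TIMEOUT}
```

Using a `str`-backed enum lets the raw `status` field from the payload be parsed with `BuildStatus(value)` while still comparing equal to plain strings.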
Appendix 3: Cloud Build payload examples¶
An example payload of a Cloud Build event is shown below:
createTime: '2018-02-22T14:49:54.066666971Z'
finishTime: '2018-02-22T14:50:05.463758Z'
id: bcdb9c48-d92c-4489-a3cb-08d0f0795a0b
images:
- us-east1-docker.pkg.dev/gcb-docs-project/quickstart-image
logUrl: https://console.cloud.google.com/cloud-build/builds/bcdb9c48-d92c-4489-a3cb-08d0f0795a0b?project=gcb-docs-project
logsBucket: gs://404889597380.cloudbuild-logs.googleusercontent.com
projectId: gcb-docs-project
results:
buildStepImages:
- sha256:a4363bc75a406c4f8c569b12acdd86ebcf18b6004b4f163e8e6293171462a79d
images:
- digest: sha256:1b2a237e74589167e4a54a8824f0d03d9f66d3c7d9cd172b36daa5ac42e94eb9
name: us-east1-docker.pkg.dev/gcb-docs-project/quickstart-image
pushTiming:
endTime: '2018-02-22T14:50:04.731919081Z'
startTime: '2018-02-22T14:50:00.874058710Z'
- digest: sha256:1b2a237e74589167e4a54a8824f0d03d9f66d3c7d9cd172b36daa5ac42e94eb9
name: us-east1-docker.pkg.dev/gcb-docs-project/quickstart-image:latest
pushTiming:
endTime: '2018-02-22T14:50:04.731919081Z'
startTime: '2018-02-22T14:50:00.874058710Z'
source:
storageSource:
bucket: gcb-docs-project_cloudbuild
generation: '1519310993665963'
object: source/1519310992.2-8465b08c79e14e89bee09adc8203c163.tgz
sourceProvenance:
fileHashes:
gs://gcb-docs-project_cloudbuild/source/1519310992.2-8465b08c79e14e89bee09adc8203c163.tgz#1519310993665963:
fileHash:
- value: -aRYrWp2mtfKhHSyWn6KNQ==
resolvedStorageSource:
bucket: gcb-docs-project_cloudbuild
generation: '1519310993665963'
object: source/1519310992.2-8465b08c79e14e89bee09adc8203c163.tgz
startTime: '2018-02-22T14:49:54.966308841Z'
status: SUCCESS
steps:
- args:
- build
- --no-cache
- -t
- us-east1-docker.pkg.dev/gcb-docs-project/quickstart-image
- .
name: gcr.io/cloud-builders/docker
status: SUCCESS
timing:
endTime: '2018-02-22T14:50:00.813257422Z'
startTime: '2018-02-22T14:50:00.102600442Z'
timeout: 600s
timing:
BUILD:
endTime: '2018-02-22T14:50:00.873604173Z'
startTime: '2018-02-22T14:50:00.102589403Z'
FETCHSOURCE:
endTime: '2018-02-22T14:50:00.087286880Z'
startTime: '2018-02-22T14:49:56.962717504Z'
PUSH:
endTime: '2018-02-22T14:50:04.731958202Z'
startTime: '2018-02-22T14:50:00.874057159Z'
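When delivered through a Pub/Sub push subscription, the payload above arrives base64-encoded in the `data` field of the push envelope. A decoding sketch (the function name is illustrative):

```python
import base64
import json


def decode_pubsub_push(envelope: dict) -> dict:
    """Extract and decode the Cloud Build payload from a Pub/Sub push envelope."""
    data = envelope["message"]["data"]
    return json.loads(base64.b64decode(data).decode("utf-8"))
```

This is the inverse of the encoding performed in the test suite in Appendix 6, where the build payload is serialized with `json.dumps` and wrapped with `base64.b64encode`.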
Appendix 4: Component details¶
1 | Repository notification¶
| Property | Value |
|---|---|
| ID | COMPONENT-01 |
| Resource | Repository |
| Identifier | Repository: Notification |
| Role | Contain notification service |
2 | Repository infra¶
| Property | Value |
|---|---|
| ID | COMPONENT-02 |
| Resource | Repository |
| Identifier | Repository: Infra |
| Role | Contain the infrastructure components in GCP |
3 | CloudBuild¶
| Property | Value |
|---|---|
| ID | COMPONENT-03 |
| Resource | Cloud Build |
| Identifier | Data project service |
| Role | CICD platform |
4 | Cloud PubSub [Topic]¶
| Property | Value |
|---|---|
| ID | COMPONENT-04 |
| Resource | Cloud PubSub |
| Identifier | Topic: cloud-builds |
| Role | Handle event from build |
| Project | $PROJECT_ID |
| Message Retention Duration | 6 hours (21600 seconds) |
Note:
As of May 2024, Cloud Build did not support publishing events to a custom Pub/Sub topic such as subscribe-build-notification; events are published only to the topic named cloud-builds. Therefore, do not change the name of the topic.
5 | Cloud PubSub [Subscription]¶
| Property | Value |
|---|---|
| ID | COMPONENT-05 |
| Resource | Cloud PubSub |
| Identifier | Subscription: prod-sub-notification-cloudbuild |
| Role | Subscription for the build events |
| Message Retention Duration | 6 hours (21600 seconds) |
| Expiration period | Never expires |
| Acknowledgement deadline | 60s |
| Retain acked messages | True |
| Exactly once delivery | False |
| Dead lettering | Not enabled |
| Push config: Endpoint | $RUN_NOTIFICATION_SERVICE_ENDPOINT/event/cloudbuild |
| Push config: Service Account | sa-spirit-breaker@${PROJECT_ID}.iam.gserviceaccount.com |
| Retry config | minimum_backoff="30s", maximum_backoff="600s" |
Note:
Get the RUN_NOTIFICATION_SERVICE_ENDPOINT value from the URL of the successfully deployed Cloud Run service.
6 | Notification Service on Cloud Run¶
| Property | Value |
|---|---|
| ID | COMPONENT-07 |
| Resource | $ENV-notification-service |
| Identifier | $ENV-notification-service |
| Role | Serverless notification service |
Listens for Cloud Build events on the route /event/cloudbuild
Production:
| Property | Value |
|---|---|
| Deployment Strategy | CICD |
| Deployment Triggers | On push into production branch |
| Targeted resource | Google Cloud Platform |
| Endpoint | https://<service-name>-<hash>-<region>.a.run.app |
7 | Communication Channel: Slack¶
| Property | Value |
|---|---|
| ID | COMPONENT-08 |
| Resource | Slack |
| Identifier | Slack Application |
| Role | The Slack application in charge of publishing messages to the configured channels |
Appendix 5: Troubleshooting¶
More information:
- Troubleshooting for Cloud Run. See Troubleshoot Cloud Run issues
Error: failed to start and listen on the port¶
When deploying the Cloud Run service, the following error may occur:
ERROR: (gcloud.run.deploy) The user-provided container failed to start and listen on the port defined provided by the PORT=5000 environment variable within the allocated timeout. This can happen when the container port is misconfigured or if the timeout is too short. The health check timeout can be extended. Logs for this revision might contain more information.
Cloud Run injects the listening port through the PORT environment variable at runtime; the container must bind to $PORT rather than a hard-coded port. Remove any hard-coded port configuration from the application entrypoint.
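The port resolution can be isolated in a small helper so the entrypoint never hard-codes a port; a sketch, where the function name and default are assumptions:

```python
import os


def resolve_port(default: int = 8080) -> int:
    """Return the port Cloud Run injects via $PORT, falling back to a default."""
    return int(os.environ.get("PORT", default))
```

The Dockerfile's ENTRYPOINT achieves the same effect by expanding `$PORT` when invoking uvicorn.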
Appendix 6: Test suite to validate the solution¶
#!/usr/bin/env python3

# External
from fastapi.testclient import TestClient
import structlog
import pytest
import base64
import json
import uuid

# Construct
LOG: structlog.stdlib.BoundLogger = structlog.get_logger()
structlog.contextvars.bind_contextvars(name="notification", pipeline="build.event")


@pytest.fixture
def client() -> TestClient:
    # Test client bound to the application served by `uvicorn entrypoint:app`
    from entrypoint import app
    return TestClient(app)


@pytest.fixture
def on_path() -> str:
    return "/event/cloudbuild"


@pytest.fixture
def payload() -> dict:
    return {
        "createTime": "2018-02-22T14:49:54.066666971Z",
        "finishTime": "2018-02-22T14:50:05.463758Z",
        "id": "bcdb9c48-d92c-4489-a3cb-08d0f0795a0b",
        "images": [
            "us-east1-docker.pkg.dev/space/notification-service"
        ],
        "logUrl": "https://console.cloud.google.com/cloud-build/builds/bcdb9c48-d92c-4489-a3cb-08d0f0795a0b?project=space",
        "logsBucket": "gs://404889597380.cloudbuild-logs.googleusercontent.com",
        "projectId": "space",
        "results": {
            "buildStepImages": [
                {
                    "sha256": "a4363bc75a406c4f8c569b12acdd86ebcf18b6004b4f163e8e6293171462a79d"
                }
            ],
            "images": [
                {
                    "digest": "sha256:1b2a237e74589167e4a54a8824f0d03d9f66d3c7d9cd172b36daa5ac42e94eb9",
                    "name": "us-east1-docker.pkg.dev/space/notification-service"
                },
                {
                    "digest": "sha256:1b2a237e74589167e4a54a8824f0d03d9f66d3c7d9cd172b36daa5ac42e94eb9",
                    "name": "us-east1-docker.pkg.dev/space/notification-service:latest"
                }
            ]
        },
        "source": {
            "repoSource": {
                "projectId": "thuyetbao",
                "repoName": "space",
                "branchName": "master",
                "commitSha": "8465b08c79e14e89bee09adc8203c163"
            }
        },
        "sourceProvenance": {
            "fileHashes": {
                "gs://space_cloudbuild/source/1519310992.2-8465b08c79e14e89bee09adc8203c163.tgz#1519310993665963": {
                    "fileHash": {
                        "value": "-aRYrWp2mtfKhHSyWn6KNQ=="
                    }
                }
            },
            "resolvedStorageSource": {
                "bucket": "space-cloudbuild",
                "generation": "1519310993665963",
                "object": "source/1519310992.2-8465b08c79e14e89bee09adc8203c163.tgz"
            }
        },
        "startTime": "2018-02-22T14:49:54.966308841Z",
        "status": "SUCCESS",
        "steps": [
            {
                "args": [
                    "build",
                    "--no-cache",
                    "-t",
                    "us-east1-docker.pkg.dev/space/build-notification-service",
                    "."
                ],
                "name": "gcr.io/cloud-builders/docker",
                "status": "SUCCESS",
                "timing": {
                    "endTime": "2018-02-22T14:50:00.813257422Z",
                    "startTime": "2018-02-22T14:50:00.102600442Z"
                }
            }
        ],
        "timeout": "600s",
        "timing": {
            "BUILD": {
                "endTime": "2018-02-22T14:50:00.873604173Z",
                "startTime": "2018-02-22T14:50:00.102589403Z"
            },
            "FETCHSOURCE": {
                "endTime": "2018-02-22T14:50:00.087286880Z",
                "startTime": "2018-02-22T14:49:56.962717504Z"
            },
            "PUSH": {
                "endTime": "2018-02-22T14:50:04.731958202Z",
                "startTime": "2018-02-22T14:50:00.874057159Z"
            }
        }
    }


def test_push_success(client: TestClient, on_path: str, payload: dict):
    on_status = "SUCCESS"
    payload["status"] = on_status
    message = {
        "message": {
            "attributes": {
                "buildId": payload["id"],
                "status": on_status,
                "additionalProp1": {}
            },
            "data": base64.b64encode(json.dumps(payload).encode()).decode(),
            "message_id": uuid.uuid4().hex,
            "additionalProp1": {}
        },
        "subscription": "projects/{}/subscriptions/{}".format(
            payload["projectId"],
            "sub-cloudbuild-notification"
        )
    }
    resp = client.post(
        url=on_path,
        headers={"Content-Type": "application/json"},
        json=message
    )
    assert resp.status_code == 200
    assert resp.json() == {"status": "ok"}


def test_push_failure(client: TestClient, on_path: str, payload: dict):
    on_status = "FAILURE"
    payload["status"] = on_status
    payload["tags"] = ["production"]  # To trigger the maintainer call-out in production only
    message = {
        "message": {
            "attributes": {
                "buildId": payload["id"],
                "status": on_status,
                "additionalProp1": {}
            },
            "data": base64.b64encode(json.dumps(payload).encode()).decode(),
            "message_id": uuid.uuid4().hex,
            "additionalProp1": {}
        },
        "subscription": "projects/{}/subscriptions/{}".format(
            payload["projectId"],
            "sub-cloudbuild-notification"
        )
    }
    resp = client.post(
        url=on_path,
        headers={"Content-Type": "application/json"},
        json=message
    )
    assert resp.status_code == 200
    assert resp.json() == {"status": "ok"}