Azure Service Fabric Application Patterns

There are many different ways to build and host applications on top of Azure Service Fabric. My intention with this blog is to review the various options and give some context into why and when you might choose different approaches.

Application Models

Service Fabric enables you to bring existing applications onto the platform with little to no need for code modification. Currently the platform supports the following application models:

  • Windows Containers
  • Guest Executables
  • Reliable Stateless Services
  • Reliable Stateful Services
  • Reliable Actor Model

When migrating an existing application, you are likely to start by using either Windows containers or guest executables. Windows containers provide you with a way of packaging and isolating your application's dependencies in a portable form that is widely supported. Alternatively, guest executables allow you to simply host existing executables in a lightweight way on the fabric. This works very nicely with statically linked executables that don't really benefit from the dependency management that Windows containers offer. Once you've migrated your existing application(s) onto the fabric, you can then start to think about building richer, native Service Fabric services that leverage the Service Fabric Reliable Services SDK. Across all 5 of these application types, the platform gives you the benefits of high availability, scalability and managed ALM.

Windows Containers

Service Fabric's support for Windows containers is a great way to move existing Windows workloads to a modern application platform. If you have an existing Windows container then it's simple to host it on Service Fabric. If you don't then it's worth heading over to the Windows container docs (https://docs.microsoft.com/en-us/virtualization/windowscontainers/) to get started containerizing your application.

Visual Studio 2017 with Azure tooling comes with templates for creating a Service Fabric Windows container based service. From the new Service Fabric Application menu in Visual Studio, you can define your external container image. This could be hosted in a public container registry such as DockerHub, or in a private and secure registry such as one hosted on the Azure Container Registry service.

As you can see in the above screenshot, there is a warning telling me that my local cluster doesn't support containers. This is in fact untrue, as I have Windows containers enabled on my Windows 10 machine. However, it does highlight that you are required to host your remote Azure cluster on the VM SKU "2016-Datacenter-with-containers" if you wish to run containers.

Service Manifest

A Windows Container Service Fabric service is effectively a hollow service. It really only consists of a ServiceManifest.xml. The key difference between a Windows Container service manifest and that of any other service manifest is the EntryPoint section.

<EntryPoint>
  <ContainerHost>
    <ImageName>mycontainerimage</ImageName>
    <Commands>mycommand</Commands>
  </ContainerHost>
</EntryPoint>

This section is telling Service Fabric that this is an externally hosted container image that it'll need to pull and run with the given command.

If you wish to expose your container as a web service or app, then you'll need to configure the container port mapping. This is done by defining an endpoint within the ServiceManifest.xml:

<Resources>
  <Endpoints>
    <Endpoint Name="ContainerServiceTypeEndpoint" Port="80" UriScheme="http" />
  </Endpoints>
</Resources>

Application Manifest

To finish the configuration, you'll need to add an endpoint reference in your ApplicationManifest.xml.

<ServiceManifestImport>  
    <ServiceManifestRef ServiceManifestName="ContainerServicePkg" ServiceManifestVersion="1.0.0" />
    <ConfigOverrides />
    <Policies>
      <ContainerHostPolicies CodePackageRef="Code">
        <PortBinding ContainerPort="80" EndpointRef="ContainerServiceTypeEndpoint"/>
      </ContainerHostPolicies>
    </Policies>
  </ServiceManifestImport>

This config will inform Service Fabric to bind the container port 80 to the host port 80.
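
If the image lives in a private registry such as Azure Container Registry, the same ContainerHostPolicies element can also carry the credentials used to pull it. The fragment below is a minimal sketch rather than a complete manifest: the account name and password value are placeholders I have made up, and it assumes the password has been encrypted beforehand.

```xml
<!-- ApplicationManifest.xml fragment; account name and password are placeholders -->
<Policies>
  <ContainerHostPolicies CodePackageRef="Code">
    <!-- Credentials Service Fabric uses when pulling the image from the registry -->
    <RepositoryCredentials AccountName="myregistryuser" Password="[encrypted-password]" PasswordEncrypted="true" />
    <PortBinding ContainerPort="80" EndpointRef="ContainerServiceTypeEndpoint" />
  </ContainerHostPolicies>
</Policies>
```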

Guest Executables

When you have an .exe that you just want to run on Service Fabric with minimal work, guest executables provide a simple way to do so. You can create a guest executable using Visual Studio tooling, PowerShell or the REST API. All these steps are documented here.
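
To give a feel for what this looks like, a guest executable's ServiceManifest.xml uses an ExeHost entry point where a container service uses ContainerHost. The following is a sketch only: MyApp.exe and its arguments are hypothetical, and a real manifest also needs the surrounding ServiceManifest and service type elements.

```xml
<!-- ServiceManifest.xml fragment for a guest executable (names are placeholders) -->
<CodePackage Name="Code" Version="1.0.0">
  <EntryPoint>
    <ExeHost>
      <!-- Path is relative to the root of the code package folder -->
      <Program>MyApp.exe</Program>
      <!-- Hypothetical command-line arguments passed to the executable -->
      <Arguments>--port 8080</Arguments>
      <!-- Run with the code package folder as the working directory -->
      <WorkingFolder>CodePackage</WorkingFolder>
    </ExeHost>
  </EntryPoint>
</CodePackage>
```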

Reliable Stateless Services

The Reliable Stateless Service type allows you to host a service that has full native integration with the underlying fabric but no state management. Realistically, this service type is mainly used to host a transient web service of some form. Visual Studio 2017 with the Azure tooling comes with templates for creating ASP.NET Core stateless web services. These recently replaced the OWIN-based stateless web service template. As long as your stateless service plays nicely with the Service Fabric life cycle and hosting model, you can run whatever code you wish.

The life cycle model is slightly different to a normal IIS hosted web server. Details of the difference can be found here.

Reliable Stateful Services

Unlike the Reliable Stateless Service type mentioned above, the Reliable Stateful Service type does come with built-in state management. This is exposed to the user through the ReliableCollection types: ReliableDictionary<TKey, TValue>, ReliableQueue<T> and ReliableConcurrentQueue<T>.

Each service has its own StateManager which does a number of things. Firstly, whilst maintaining the service's state in memory for fast access, it also pages the state to disk to ensure no data is lost during host machine reboots. Secondly, it pushes the state change onto its replication log and transmits it out to the other nodes in the network with a given sequence id. When the StateManager receives an accept message from a quorum of peers, without receiving a commit message with a higher sequence id, then it will consider the state committed, release any existing locks and publish a commit message.

If you design your stateful service as a singleton, then your service will be limited to the resources available on a single node within the cluster. In order to support more throughput and higher volumes of data storage, you can partition your stateful service across the cluster. This effectively runs multiple instances of your service, each responsible for a particular subset of the service's total data. For any given piece of data, the instance that ultimately stores it is determined by something called a PartitionKey. The stateful service's client can create a PartitionKey from anything and then use the Service Fabric Naming Service to resolve the service address of the responsible instance. The most suitable sources for a PartitionKey tend to be values that give an even spread of traffic across the stateful service instances.
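
One way to declare such a partitioned service is the ranged (UniformInt64) scheme, where the Int64 key space is split evenly across a fixed number of partitions. The fragment below is an illustrative sketch: the service and type names, the partition count of 4 and the replica set sizes are assumptions chosen for the example, not recommendations.

```xml
<!-- ApplicationManifest.xml service declaration; names and counts are placeholders -->
<Service Name="MyStatefulService">
  <StatefulService ServiceTypeName="MyStatefulServiceType" TargetReplicaSetSize="3" MinReplicaSetSize="3">
    <!-- Split the full Int64 range into 4 equal partitions; a client maps its
         PartitionKey (for example a hash of a user id) into this range -->
    <UniformInt64Partition PartitionCount="4" LowKey="-9223372036854775808" HighKey="9223372036854775807" />
  </StatefulService>
</Service>
```

Because the range is divided uniformly, a key derived from a well-distributed hash tends to spread load evenly, whereas a key with a skewed distribution will concentrate traffic on a few partitions.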

Having to resolve the fabric address for a stateful service from a generated PartitionKey means that your stateful service's client needs to be able to query the cluster's internal naming service. Therefore, if you wish to use your stateful service from outside the cluster, you need to implement the Gateway service pattern. This means standing up a stateless service hosting a web server that does the service resolution for you based on the headers or content of the request. For example, you might use the session id from an HTTP request to an ASP.NET Web Api as the PartitionKey to resolve the stateful service instance that stores the user's data.

It's worth understanding the limitations of the Reliable Stateful Services model. As ReliableCollection operations are all asynchronous due to the replication and disk I/O, they cannot simply implement the synchronous IEnumerable interface that is commonly associated with LINQ. Instead, the ReliableCollections expose IAsyncEnumerable, which doesn't yet support the full LINQ API, so you have to work with it slightly differently to how you might have expected.

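// Assumes dictionary is an IReliableDictionary and tx is an open ITransaction.
using (var e = (await dictionary.CreateEnumerableAsync(tx)).GetAsyncEnumerator())
{
    while (await e.MoveNextAsync(cancellationToken).ConfigureAwait(false))
    {
        doSomething(e.Current);
    }
}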

Alternatively, you can use the helper libraries people have written to wrap the standard IEnumerable interface, such as: https://gist.github.com/aelij/987d974c811865029564f1bbeffb6b47

The Reliable Stateful Services have also not been designed to support analytics, reporting or cold storage. For all 3 of these use cases, I strongly recommend you consider pushing/pulling data into more appropriate downstream services such as Cosmos DB, SQL DB, etc.

Given the above considerations, a pattern I often see used is as follows:

The Stateless Web Api exposes a REST/SOAP based API for upstream clients. The Web Api resolves the appropriate stateful service instance via the cluster's naming service and then forwards the request using Service Fabric's built-in service remoting framework. The stateful service stores the state durably within the cluster and then asynchronously archives the data to an external data service. Any reporting or data analytics are then performed against the external data service.

Backup and Restore

When storing important data inside a Reliable Stateful Service, it's common practice to back up your data to an external repository for disaster recovery. It's important at this stage to highlight the difference between backup and archive. Backing up a service's state allows you to restore the service should the cluster incur data loss. This is different to archiving the data for audit, long term storage or any other use. The frequency with which you back up directly determines how much data your service can potentially lose during a disaster: the less often you back up, the more you stand to lose. It is therefore very important to define your recovery point objective (RPO) and recovery time objective (RTO) and align them with your service's backup strategy.

  • RPO: The amount of data you can afford to lose during a disaster
  • RTO: The amount of downtime you can afford to get your service back up and running after a disaster

Once you have defined how frequently you wish to back up your service, it's fairly trivial to implement. You can create a BackupDescription that tells the fabric whether to do incremental or full backups and which callback function to invoke when it needs to do a backup (something similar to below).

// BackupCallbackAsync takes (BackupInfo, CancellationToken) and returns Task<bool>.
BackupDescription myBackupDescription =
    new BackupDescription(BackupOption.Incremental, this.BackupCallbackAsync);
await this.BackupAsync(myBackupDescription);

For more details on how to set up backup and restore, please refer to the documentation. What I will say is, if you have a backup and restore strategy, please do test it! There's no point waiting for a disaster to test whether your disaster recovery works...

Encryption at rest

Currently Virtual Machine Scale Sets, which Service Fabric is based upon, do not support BitLocker. Therefore the VMs' disks are not currently encrypted. Additionally, if you wish to store encrypted data within a ReliableCollection, you will need to encrypt the data before writing it to the collection and decrypt it after reading it back. This can be done without too much effort and packaged as a NuGet package for reuse across your services.

External Client Access

Once you've got your applications up and running in the cluster, you probably want to expose some services externally for 3rd party clients and users to access. There are a number of ways to achieve this, and I'll walk you through each below. Further details can be found here.

Addressable Ports

The simplest and least suitable method is to simply define a specific Port in the service manifest and expose it on any load balancers, firewalls, etc. This method is usually quick to try out but requires you to punch holes in your security every time you deploy a new service and also limits you to a single service instance per node due to port contention.
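
To make the port contention point concrete, the sketch below contrasts a fixed port with an endpoint that omits the Port attribute, in which case Service Fabric assigns a port from the application port range. The endpoint names and the port number are placeholders of my own.

```xml
<!-- ServiceManifest.xml fragment; endpoint names and port are placeholders -->
<Resources>
  <Endpoints>
    <!-- Fixed port: must be opened on the load balancer and firewall,
         and only one instance per node can bind to it -->
    <Endpoint Name="FixedEndpoint" Protocol="http" Port="8080" />
    <!-- No Port attribute: Service Fabric assigns one from the application
         port range, so several instances can share a node -->
    <Endpoint Name="DynamicEndpoint" Protocol="http" />
  </Endpoints>
</Resources>
```

A dynamically assigned port is not predictable from outside the cluster, which is one reason the approaches in the following sections exist.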

Reverse Proxy

Alternatively, you can use a reverse proxy to route requests to your services. Service Fabric has its own reverse proxy which can be enabled, or you can implement your own. The idea here is that there is a single entry point and therefore a single port that needs to be externally addressable. This service then uses a request path convention, e.g. /MyApplication/MyService, to route traffic to its intended destination.

API Management

The API Management (APIM) team have created a plug-in which integrates this routing logic into their service. This allows your clients to target your exposed APIM endpoint and let that service do the routing for you. Attaching the APIM to the same VNET as the cluster allows you to keep communication between the services secure. You can configure APIM to act as a façade across a collection of APIs, add throttling, SSL termination, request transformations, etc.

Default Services

By default, whenever you add a new Service Fabric service to your Service Fabric application in Visual Studio, it will add it as a new default service within the ApplicationManifest.xml.

<DefaultServices>
     <Service Name="Guest1" ServicePackageActivationMode="ExclusiveProcess">
          <StatelessService ServiceTypeName="Guest1Type" InstanceCount="-1">
               <SingletonPartition />
          </StatelessService>
     </Service>
     <Service Name="ContainerService">
         <StatelessService ServiceTypeName="ContainerServiceType" InstanceCount="-1">
              <SingletonPartition />
         </StatelessService>
     </Service>
</DefaultServices>

This is convenient, as every time you deploy your application the fabric will automatically deploy each of the services it contains. However, this violates the independent deployment pattern, stops you providing custom configuration and versioning, and doesn't give you control of the start-up order. I would advise not using default services at all and instead deploying each service independently.

Summary

This was a brief run through of some Azure Service Fabric application patterns. I've tried to explain the scenarios in which you would use each of the different programming models and how I've seen each used. This is not a comprehensive review of any of the models in depth, as that is all documented in the Azure docs.