The post AppFabric Caching: What Now? appeared first on ScaleOut Software.
Eighteen months ago we published a blog post on the performance and feature shortcomings of Microsoft’s Windows Server AppFabric (WSAF) Caching. Since then much has transpired. Earlier this year, Microsoft announced that it will end support for Windows Server AppFabric 1.1 by April 2017. AppFabric Caching users now have to determine the right next step in migrating to an alternative distributed cache.
Recommended alternatives found lacking
With its “mobile first, cloud first” strategy, it appears that Microsoft is pushing customers to its Microsoft Azure cloud platform by recommending “all Microsoft AppFabric customers using Cache to move to Microsoft Azure Redis Cache.” However, for many customers it is not yet practical to move to Azure, and a fully supported, on-premise solution is required for their distributed cache. Microsoft’s recommendation that customers move to Redis is both controversial and misleading since the Redis open source community does not recommend running Redis on Windows.
Further, Redis is an in-memory database that works well on a single server but lacks many of the scalability and high-availability features of a mature, fully featured in-memory data grid, let alone real-time analytics and computing capabilities. This leaves customers with an uncertain and commercially unsupported future when migrating their on-premise, .NET-compatible distributed cache away from AppFabric Caching.
A better alternative
Luckily, ScaleOut Software has been loyally serving the .NET developer community with an industry-leading in-memory data grid for over a decade. ScaleOut’s architectural design philosophy focuses on delivering high performance with maximum ease-of-use. It employs a single, coherent architecture for integrating scalability and high availability; this architecture is transparently leveraged in all aspects of its in-memory data grid. We call this “scalable, highly available everything” — the platform goes beyond linear performance scaling for accessing grid objects and uses a common architecture for all features such as distributed locking, event processing, load-balancing, geographic replication, parallel computation and backup/restore.
The key benefits and advantages that set ScaleOut StateServer® apart from its industry peers are:
Peer-to-peer architecture:
Ease-of-use:
Extended functionality:
ScaleOut StateServer (and its product extensions) go far beyond AppFabric Caching’s basic capabilities and add important functionality, including:
Furthermore, unlike Redis and some of the other open-source alternatives, ScaleOut StateServer is fully commercially supported by ScaleOut Software for use on Windows (or Linux).
Replacement and migration options
Customers have two paths to choose from when planning their migration from AppFabric Caching to ScaleOut Software’s distributed cache: retain source-code compatibility for their existing legacy applications, or migrate their applications to take full advantage of native ScaleOut APIs.
Source-code compatible library
The ScaleOut Windows Server AppFabric (WSAF) Caching Compatibility Library is a source-code compatible, drop-in replacement for Microsoft AppFabric Caching APIs. It lets existing applications preserve the legacy AppFabric Caching API semantics and switch to ScaleOut StateServer without any code changes, and administrators can continue to use familiar PowerShell commands to manage the distributed cache. This library ships as part of ScaleOut StateServer release 5.4 and later, as described in the WSAF Caching Compatibility Library Reference.
Native ScaleOut APIs
Customers also can rewrite their applications to use ScaleOut StateServer’s native APIs, which allows applications to take full advantage of the extended functionality mentioned above. Using a hybrid approach, native ScaleOut StateServer APIs can be used alongside AppFabric APIs. We have developed a detailed technical AppFabric Caching migration guide to help developers through this process.
Using the WSAF Caching Compatibility Library
The WSAF Caching Compatibility Library is easy to integrate into applications that use AppFabric Caching’s APIs. Here is an outline of the required steps:
For More Information
More information about all the AppFabric Caching replacement and migration resources available can be found at www.scaleoutsoftware.com/appfabric, including a valuable offer for former AppFabric Caching users. We hope that the power of our platform as both a replacement for and an upgrade from Microsoft’s WSAF Caching has captured your interest. Regardless of whether you run on-premise or in the cloud, it may be the right next step for your application.
The post AppFabric Caching: Retry Later appeared first on ScaleOut Software.
Given all this, we thought it would be a good opportunity to see how we are doing relative to the competition, and in particular, relative to Microsoft’s AppFabric caching for Windows on-premise servers. In addition to looking at performance differences, we also want to compare ScaleOut StateServer (SOSS) to AppFabric on qualitative measures, such as features, ease of installation, and management.
Well, performance comparisons aren’t so easy since the AppFabric license agreement states: “You may not disclose the results of any benchmark tests of the software to any third party without Microsoft’s prior written approval.” So our comments will be confined to what testing we felt was valuable and how well SOSS performed.
In a recent customer engagement, the application needed to load 10M objects into each server of the IMDG’s cluster to make full use of high-end servers with 60GB memory capacity. Measuring object creation and access rates on a single server is a good test of the IMDG’s memory management and multithreading, and this is an area in which we have made several performance optimizations. Using a workload of random object sizes varying from 200B to 2KB, SOSS was able to load 2 million objects in 59.2 seconds and then sustain 18K read/update pairs per second to random objects. That’s actually quite fast. (We invite you to test AppFabric’s performance; contact us if you need a test application.)
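The shape of this workload can be sketched in a few lines. This is only an illustration under stated assumptions, not the test application mentioned above: the function names are hypothetical, sizes are drawn uniformly (the post says only "random"), "2KB" is read as 2048 bytes, and a plain dictionary stands in for the cache client.

```python
import random
import time

MIN_SIZE = 200        # bytes, lower bound quoted in the post
MAX_SIZE = 2 * 1024   # bytes; "2KB" interpreted here as 2048 (assumption)


def make_payload(rng):
    """Return a zero-filled byte string with a uniformly random length in range."""
    return bytes(rng.randint(MIN_SIZE, MAX_SIZE))


def load(store, count, rng):
    """Create `count` objects under integer keys; return elapsed seconds."""
    start = time.perf_counter()
    for key in range(count):
        store[key] = make_payload(rng)
    return time.perf_counter() - start


def read_update_pairs(store, pairs, rng):
    """Read a random object, then overwrite it; return pairs completed per second."""
    start = time.perf_counter()
    for _ in range(pairs):
        key = rng.randrange(len(store))
        _ = store[key]                    # read
        store[key] = make_payload(rng)    # update
    elapsed = time.perf_counter() - start
    return pairs / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    rng = random.Random(42)
    store = {}  # stand-in only; a real test would call the cache client here
    seconds = load(store, 10_000, rng)
    rate = read_update_pairs(store, 5_000, rng)
    print(f"loaded {len(store)} objects in {seconds:.3f}s; "
          f"{rate:.0f} read/update pairs/s")
```

Numbers produced against a local dictionary say nothing about any IMDG; the point is only the structure of the test, a bulk load phase followed by paired reads and updates to random keys.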
We also looked at load-balancing and recovery times for this workload of 2M objects when adding a server to a 2-server cluster, removing the third server, and also just killing the third server. This measures how well the IMDG’s servers use multithreading to maximize network bandwidth during load-balancing, and it also evaluates failure detection and recovery algorithms. These are areas in which we have invested heavily to take advantage of 10 Gbps (and faster) networks and to handle intermittent network delays inherent in virtual server infrastructures. While handling an access load of 6K read/update pairs per second, SOSS was measured to complete load-balancing in less than 35 seconds for joins and leaves and also to complete recovery and restore full throughput in this amount of time after a server failure.
We were surprised to discover that AppFabric throws exceptions to the client application during load-balancing and recovery, as well as when security authorization is insufficient, as described in other blog posts. During load-balancing, the client gets the following exception when accessing the cache:
ErrorCode<ERRCA0017>:SubStatus<ES0006>:There is a temporary failure. Please retry later. (One or more specified cache servers are unavailable, which could be caused by busy network or servers. …)
You’ll also see a “Please retry later” exception (forever!) if the client runs with insufficient security authorization; short of a much deeper investigation, we had to run the client as administrator to avoid these problems. The client also throws these exceptions during “graceful” host shutdowns and during recovery.
To keep application development as straightforward as possible, SOSS handles all exceptions that occur during membership changes and load-balancing, automatically retrying requests within the server as necessary. This avoids exposing the application to the details of the IMDG’s internal operations unless the IMDG is completely unreachable.
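When a cache instead surfaces "retry later" errors to the client, the application has to carry retry logic of its own. The following is a minimal sketch of that burden, assuming a hypothetical `TransientCacheError` exception and `call_with_retry` helper; neither is part of the AppFabric or ScaleOut APIs, and the backoff parameters are arbitrary.

```python
import time


class TransientCacheError(Exception):
    """Hypothetical stand-in for a 'temporary failure, please retry later' error."""


def call_with_retry(operation, attempts=5, base_delay=0.1, sleep=time.sleep):
    """Run `operation`, retrying on TransientCacheError with exponential backoff.

    Re-raises the last error once `attempts` calls have failed.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except TransientCacheError:
            if attempt == attempts - 1:
                raise
            sleep(base_delay * (2 ** attempt))


if __name__ == "__main__":
    calls = {"n": 0}

    def flaky_read():
        # Fails twice, as if the cluster were rebalancing, then succeeds.
        calls["n"] += 1
        if calls["n"] < 3:
            raise TransientCacheError("retry later")
        return "value"

    print(call_with_retry(flaky_read, sleep=lambda s: None))
```

Every cache call in the application would need to be wrapped this way, and a bounded retry count still cannot distinguish a brief rebalance from a permanent condition such as insufficient authorization.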
Our experience with customers consistently reinforces the need to keep middleware software as easy to install and use as possible, especially given that it is deployed as distributed software running on a cluster of servers. This philosophy manifests itself in all areas, including installation, application development, and cluster management. We believe that installing our software should be as straightforward as we can make it, requiring minimal knowledge of the host operating system and the fewest possible explicit configuration settings.
AppFabric caching does not share this design approach and requires the configuration of numerous components, including security policies, SQL Server or a network file share with the appropriate permissions, “lead” hosts, etc. Also, a long list of PowerShell commands for managing the cache must be learned and correctly used. Running AppFabric caching in production typically requires the use of a domain, and the product is deeply tied into the Windows Server infrastructure. As expected, AppFabric requires the comprehensive knowledge of a Windows system administrator to install and manage.
In our testing, the net result was about a half-day investment in time on our part (and some frustration) getting AppFabric up and running. After spending an hour trying to join multiple cluster hosts in a Windows workgroup, we switched to using a domain controller to make this work; it just wasn’t worth the time to sort out the incorrect configuration settings. Head scratching was required in several other areas before our AppFabric cluster showed signs of life.
A major source of frustration with AppFabric is its use of PowerShell commands to manage the cache cluster. It’s easy to forget that distributed software is running on multiple servers which need to be orchestrated as a group, and that’s hard to do with a sequence of shell commands because you can’t track the state of the cluster at a glance. It’s much easier to manage a cluster with a graphical user interface (GUI) which shows the status of all hosts and alerts you to dynamic situations, such as high usage, load-balancing, or network outages.
To take full advantage of the GUI approach, SOSS uses a Windows-based management console with intuitive controls that make management of the IMDG simple and easy to learn. The console also adds advanced visualization features, such as integrated performance charts and tabular usage charts, a “heat” map showing the availability and dynamic state of all regions within the distributed store, and an object browser that lets you see stored objects and examine both their metadata and contents.
The following screenshots illustrate SOSS’s performance charting and heat map. Note that the status of the IMDG and all cluster hosts is instantly visible in the tree list at the left:
Because the GUI management console receives periodic updates from all cluster hosts, it stays tightly integrated with the cluster and dynamically updates the latest status. In contrast, the use of shell commands just gives you a snapshot of the state of the cluster at one instant in time. We also have observed that these results quickly can become out of sync with what client applications are actually experiencing. For example, a shell command can report that the cluster is in an unknown state when in fact the client is successfully completing access requests. (In AppFabric, be prepared to wait for several seconds for management commands to time out when a cluster host goes into an unknown state.)
AppFabric uses a single store, either a file share or SQL Server, to hold the cluster’s configuration, which adds complexity to installation and creates a single point of failure. Although SQL Server can be clustered, this adds even more cost than just using a single server. To avoid these problems, SOSS automatically replicates its configuration files on every cluster host to maximize availability with a fully peer-to-peer implementation. This approach also keeps the user from having to deal with managing a configuration store.
The peer-to-peer issue arises again in AppFabric’s requiring a majority of lead hosts (presumably) to reconfigure the cluster after a membership change. Because some hosts are lead hosts and others are not, an AppFabric caching cluster can go down even when most of its hosts are healthy. Moreover, the administrator has to understand and ensure that a majority quorum of lead hosts exists, and the number of lead hosts varies with cluster size. For example, a small cluster of up to 20 hosts might use 3 lead hosts, requiring two lead hosts to form a quorum. If 2 of the 3 lead hosts go down, the cluster will go down even if there are 18 healthy hosts.
With ScaleOut, you don’t need to know what the word “quorum” means. SOSS sidesteps these issues by using a fully peer-to-peer membership so that all hosts can participate in constructing the cluster membership. (SOSS makes use of a ScaleOut Software patent which eliminates the need for a majority quorum of hosts by building a logical quorum on a uniform set of servers.) This means that SOSS can avoid the use of lead hosts and maintains service as long as any host survives. System administrators view all cluster hosts as peers and do not have to be aware of SOSS’s internal mechanisms for implementing the cluster membership.
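The arithmetic behind the 20-host example can be written out as a short sketch. The function names are hypothetical, the lead-host rule follows the presumed majority requirement described above, and the peer rule simply encodes the statement that service continues while any host survives; it does not model the patented membership algorithm.

```python
def majority(n):
    """Smallest number of members that forms a strict majority of n."""
    return n // 2 + 1


def lead_host_cluster_up(lead_hosts_alive, lead_hosts_total):
    """Lead-host model: service requires a majority of the lead hosts."""
    return lead_hosts_alive >= majority(lead_hosts_total)


def peer_cluster_up(hosts_alive):
    """Peer model as the post describes it: service continues while any host survives."""
    return hosts_alive >= 1


if __name__ == "__main__":
    total_hosts, lead_total = 20, 3
    leads_down = 2
    healthy = total_hosts - leads_down
    print("healthy hosts:", healthy)
    print("lead-host model up:",
          lead_host_cluster_up(lead_total - leads_down, lead_total))
    print("peer model up:", peer_cluster_up(healthy))
```

With 3 lead hosts the majority is 2, so losing 2 of them leaves only 1 and fails the quorum check, no matter how many non-lead hosts are still healthy.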
It’s actually not clear to us whether AppFabric’s design philosophy regarding high availability is more closely aligned with “best effort” distributed caches like memcached or with mission-critical in-memory data grids like SOSS and others. Data replication is turned off by default and is explicitly set on a namespace-by-namespace basis using management commands. (High availability apparently is not available on Windows Server 2008R2 Standard Edition and requires Enterprise Edition, which can cost about $2,800 more per server, or you must upgrade to Windows Server 2012.) Also, since data is hosted in managed code, access delays due to garbage collection (as well as the exceptions noted above) are to be expected. (Microsoft recommends not storing more than 16GB in a cache host to avoid GC pauses.) Lastly, AppFabric’s client cache is not kept coherent with the distributed cache, and so the client cache cannot be used for transactional data.
In contrast, SOSS automatically replicates all stored objects on multiple hosts to maintain high availability at all times. Likewise, it uses an unmanaged heap for stored data to keep access times as predictable as possible and avoid GC pauses. It also keeps all client caches coherent with the IMDG so that multi-threaded code running on the cluster can coordinate access to transactional data using the well understood sequential consistency model.
It was tempting to write this blog post as a feature comparison between AppFabric and SOSS. It’s clear that, as a full in-memory data grid, SOSS — unlike AppFabric — incorporates many features that go well beyond distributed caching. For example, SOSS lets applications query stored data by property using Microsoft’s own LINQ, and ScaleOut Analytics Server (SOAS) can perform data-parallel analysis of queried objects using application code that SOAS automatically deploys on the cluster. ScaleOut hServer takes analytics a step further by hosting full Hadoop MapReduce on the IMDG.
That said, since AppFabric is targeted at distributed caching, we felt that AppFabric users likely are more focused on issues regarding deployment, performance, and availability. Beyond just evaluating how well products like AppFabric and SOSS extract performance from scale-up, it’s also important to examine how they stack up in their overall role of providing scalable, highly available in-memory storage.
When looking at other design approaches, we feel that ScaleOut’s philosophy of easy-to-use, fully peer-to-peer design with GUI-based management provides important dividends by simplifying deployment and maximizing visibility, while driving high performance and availability. Not surprisingly, all of this lowers the total cost of ownership, which — as we have seen — even for “free” software is never zero.