high availability Archives - ScaleOut Software

CEO William Bain Gives Talk for the Digital Twin Consortium

Fri, 04 Aug 2023 00:25:33 +0000

On August 1st, ScaleOut CEO Dr. William Bain gave a talk to members of the Digital Twin Consortium entitled “Unlocking the Power of Digital Twins for Streaming Analytics and Simulation of Large Systems.” You can download the slide deck from the link below.

In this talk, Dr. Bain described a new vision for digital twins that takes them beyond traditional applications to address challenges faced by managers of large systems with thousands or even millions of data sources. Digital twins can implement streaming analytics that continuously monitor these complex systems for emerging issues and help managers boost their situational awareness.

Numerous applications can benefit from this new use of digital twins. Examples described in the talk include tracking vehicle fleets and logistics networks, improving the safety of transportation systems, and assisting in disaster recovery.

ScaleOut Software’s in-memory computing technology makes it possible to simultaneously host thousands of digital twins and run both streaming analytics and simulations. The talk explains how this technology adds real-time aggregate analytics while lowering response times and scaling performance.

Download The Presentation Slides

Learn more about the ScaleOut Digital Twin Streaming Service.
Watch Dr. Bain’s previous digital twin talk.

The post CEO William Bain Gives Talk for the Digital Twin Consortium appeared first on ScaleOut Software.


ScaleOut Software Adds Google Cloud Support Across Products

Tue, 20 Jun 2023 20:58:25 +0000

New features enable users to deploy distributed caching with automatic connections and seamless scaling

BELLEVUE, WASH. — June 15, 2023 — ScaleOut Software today announced that its product suite now includes Google Cloud support. Applications running in Google Cloud can take advantage of ScaleOut’s industry leading distributed cache and in-memory computing platform to scale their performance and run fast, data-parallel analysis on dynamic business data. The ScaleOut Product Suite is a comprehensive collection of production-proven software products, including In-Memory Database, StateServer, GeoServer, Digital Twin Streaming Service, StreamServer and more. This integration complements ScaleOut’s existing Amazon EC2 and Microsoft Azure Cloud support to provide comprehensive multi-cloud capabilities.

“We are excited to add Google Cloud Platform support for hosting the ScaleOut Product Suite,” said Dr. William Bain, CEO of ScaleOut Software. “This support further broadens the public cloud options available to our customers for hosting our industry-leading distributed cache and in-memory analytics. Google’s impressive performance enables our distributed cache to deliver the full benefits of automatic throughput scaling to applications.”

Key benefits of ScaleOut’s support for the Google Cloud Platform include:

Simplified Deployment and Management: Users can take advantage of the ScaleOut Management Console to deploy a distributed cache to Google Cloud using a step-by-step wizard and track its status.
Automatic Clustering: Distributed caches comprising one or more virtual servers automatically create a single compute cluster to serve client requests with both scalability and high availability.
Automatic Client Connectivity: Client applications running either within Google Cloud or from remote sites can automatically connect to all caching servers within a cluster just by specifying the cluster’s name.
Elastic Performance: Using the management console, users can add or remove virtual servers from a distributed cache to meet the needs of application workloads. In addition, users can implement auto-scaling policies based on performance measures, such as memory usage.

Distributed caches, such as the ScaleOut Product Suite, allow applications to store fast-changing data, such as e-commerce shopping carts, stock prices, and streaming telemetry, in memory with low latency for rapid access and analysis. Built using a cluster of virtual or physical servers, these caches automatically scale access throughput and analytics to handle large workloads. In addition, they provide built-in high availability to ensure uninterrupted access if a server fails. They are ideal for hosting on cloud platforms, which offer highly elastic computing resources to their users without the need for capital investments.

For more information, please visit www.scaleoutsoftware.com and follow @ScaleOut_Inc on Twitter.

Additional Resources:

ScaleOut Google Cloud Platform blog post
ScaleOut Product Suite information

About ScaleOut Software

Founded in 2003, ScaleOut Software develops leading-edge software that delivers scalable, highly available, in-memory computing and streaming analytics technologies to a wide range of industries. ScaleOut Software’s in-memory computing platform enables operational intelligence by storing, updating, and analyzing fast-changing, live data so that businesses can capture perishable opportunities before the moment is lost. It has offices in Bellevue, Washington and Beaverton, Oregon.

Media Contact

Brendan Hughes

RH Strategic for ScaleOut Software
ScaleOutPR@rhstrategic.com
206-264-0246



Deploying ScaleOut’s Distributed Cache In Google Cloud

Tue, 20 Jun 2023 13:00:14 +0000

by Olivier Tritschler, Senior Software Engineer

Because of their ability to provide highly elastic computing resources, public clouds have become a highly attractive platform for hosting distributed caches, such as ScaleOut StateServer®. To complement its current offerings on Amazon AWS and Microsoft Azure, ScaleOut Software has just announced support for the Google Cloud Platform. Let’s take a look at some of the benefits of hosting distributed caches in the cloud and understand how we have worked to make both deployment and management as simple as possible.

Distributed Caching in the Cloud

Distributed caches, like ScaleOut StateServer, enhance a wide range of applications by offering shared, in-memory storage for fast-changing state information, such as shopping carts, financial transactions, and geolocation data. This data needs to be quickly updated and shared across all application servers, ensuring consistent tracking of user state regardless of the server handling a request. Distributed caches also offer a powerful computing platform for analyzing live data and generating immediate feedback or operational intelligence for applications.

Built using a cluster of virtual or physical servers, distributed caches automatically scale access throughput and analytics to handle large workloads. With their tightly integrated client-side caching, these caches typically provide faster access to fast-changing data than backing stores, such as blob stores and database servers. In addition, they incorporate redundant data storage and recovery techniques to provide built-in high availability and ensure uninterrupted access if a server fails.

To meet the needs of elastic applications, distributed caches must themselves be elastic. They are designed to transparently scale upwards or downwards by adding or removing servers as the workload varies. This is where the power of the cloud becomes clear.

Because cloud infrastructures provide inherent elasticity, they can benefit both applications and distributed caches. As more computing resources are needed to handle a growing workload, clouds can deploy additional virtual servers (also called cloud “instances”). Once a period of high demand subsides, resources can be dialed back to minimize cost without compromising quality of service. The flexibility of on-demand servers also avoids costly capital investments and reduces management costs.

Deploying ScaleOut’s Distributed Cache in the Google Cloud

A key challenge in using a distributed cache as part of a cloud-hosted application is to make the cache easy to deploy and manage, and easy for the application to access. Distributed caches are typically deployed in the cloud as a cluster of virtual servers that scales as the workload demands. To keep it simple, a cloud-hosted application should just view a distributed cache as an abstract entity and not have to keep track of individual caching servers or which data they hold. The application should not have to be concerned with connecting N application instances to M caching servers, especially when N and M (as well as cloud IP addresses) vary over time. In particular, an application should not have to discover and track the IP addresses for the caching servers.

Even though a distributed cache comprises several servers, the simplest way to deploy and manage it in the cloud is to identify the cache as a single, coherent service. ScaleOut StateServer takes this approach by identifying a cloud-hosted distributed cache with a single “store” name combined with access credentials. This name becomes the basis for both managing the deployed servers and connecting applications to the cache. It lets applications connect to the caching cluster without needing to be aware of the IP addresses for the cluster’s virtual servers.

The following diagram shows a ScaleOut StateServer distributed cache deployed in Google Cloud. It shows both cloud-hosted and on-premises applications connected to the cache, as well as ScaleOut’s management console, which lets users deploy and manage the cache. Note that while the distributed cache and applications all contain multiple servers, applications and users can access the cache just by using its store name.

Building on the features developed for the integration of Amazon AWS and Microsoft Azure, the ScaleOut Management Console now lets users deploy and manage a cache in Google Cloud by just specifying a store name and initial number of servers, as well as other optional parameters. The console does the rest, interacting with Google Cloud to start up the distributed cache and configure its servers. To enable the servers to form a cluster, the console records metadata for all servers and identifies them as having the same store name.

Here’s a screenshot of the console wizard used for deploying ScaleOut StateServer in Google Cloud:

The management console provides centralized, on-premises management for initial deployment, status tracking, and adding or removing servers. It uses Google’s managed instance groups to host servers, and automated scripts use server metadata to ensure that new servers automatically connect with an existing store. The managed instance groups used by ScaleOut also support defining auto-scaling options based on CPU and memory usage metrics.

Instead of using the management console, users can also deploy ScaleOut StateServer to Google Cloud directly with Google’s Deployment Manager using optional templates and configuration files.

Simplifying Connectivity for Applications

On-premises applications typically connect each client instance to a distributed cache using a fixed list of IP addresses for the caching servers. This process works well on premises because the cache’s IP addresses typically are well known and static. However, it is impractical in the cloud since IP addresses change with each deployment or reboot of a caching server.

To avoid this problem, ScaleOut StateServer lets client applications specify a store name and credentials to access a cloud-hosted distributed cache. ScaleOut’s client libraries internally use this store name to discover the IP addresses of caching servers from metadata stored in each server.

The following diagram shows a client application connecting to a ScaleOut StateServer distributed cache hosted in Google Cloud. ScaleOut’s client libraries make use of an internal software component called a “grid mapper” which acts as a bootstrap mechanism to find all servers belonging to a specified cache using its store name. The grid mapper accesses the metadata for the associated caching servers and returns their IP addresses back to the client library. The grid mapper handles any potential changes in IP addresses, such as servers being added or removed for scaling purposes.
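As a rough illustration of the name-based discovery described above, the following Python sketch resolves a store name to server addresses by filtering per-server metadata. The ServerRecord and GridMapper classes, the field names, and the sample addresses are all hypothetical; this is not ScaleOut's client library API, and the internals of the actual grid mapper are not described in this post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServerRecord:
    """Hypothetical metadata kept for one cloud-hosted caching server."""
    ip_address: str
    store_name: str


class GridMapper:
    """Illustrative bootstrap helper that resolves a store name to server IPs."""

    def __init__(self, metadata_source):
        # metadata_source is any callable that returns the current server records.
        self._metadata_source = metadata_source

    def resolve(self, store_name):
        # Re-read the metadata on every call so added or removed servers are seen.
        return sorted(
            record.ip_address
            for record in self._metadata_source()
            if record.store_name == store_name
        )


# Simulated metadata; a real deployment would query the cloud platform instead.
servers = [
    ServerRecord("10.0.0.4", "orders-cache"),
    ServerRecord("10.0.0.5", "orders-cache"),
    ServerRecord("10.0.0.9", "other-store"),
]
mapper = GridMapper(lambda: list(servers))

before = mapper.resolve("orders-cache")
servers.append(ServerRecord("10.0.0.6", "orders-cache"))  # the cache scales out
after = mapper.resolve("orders-cache")
print(before)
print(after)
```

The point of the design is that the application only ever holds the store name; the address list is derived on demand and can change between calls.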

Summing up

Because they provide elastic computing resources and high performance, public clouds, such as Google Cloud, offer an excellent platform for hosting distributed caches. However, the ephemeral nature of their virtual servers introduces challenges for both deploying the cluster and connecting applications. Keeping deployment and management as simple as possible is essential to controlling operational costs. ScaleOut StateServer makes use of centralized management, server metadata, and automatic client connections to address these challenges. It ensures that applications derive the full benefits of the cloud’s elastic resources with maximum ease of use and minimum cost.



Simulate at Scale with Digital Twins

Tue, 21 Feb 2023 14:00:39 +0000

Digital Twins Can Implement Both Streaming Analytics and Simulations

With the ScaleOut Digital Twin Streaming Service, the digital twin software model has proven its versatility well beyond its roots in product lifecycle management (PLM). This cloud-based service uses digital twins to implement streaming analytics and add important contextual information not possible with other stream-processing architectures. Because each digital twin can hold key information about an individual data source, it can enrich the analysis of incoming telemetry and extract important, actionable insights without delay. Hosting digital twins on a scalable, in-memory computing platform enables the simultaneous tracking of thousands, or even millions, of data sources.

Owing to the digital twin’s object-oriented design, many diverse applications can take advantage of its powerful but easy-to-use software architecture. For example, telematics applications use digital twins to track telemetry from every vehicle in a fleet and immediately identify issues, such as lost or erratic drivers or emerging mechanical problems. Airlines can use digital twins to track the progress of passengers throughout an itinerary and respond to delays and cancellations with proactive remedies that smooth operations and reduce stress. Other applications abound, including health informatics, financial services, logistics, cybersecurity, IoT, smart cities, and crime prevention.

Here’s an example of a telematics application that tracks a large fleet of vehicles. Each vehicle has a corresponding digital twin analyzing telemetry from the vehicle in real time:

Applications like these need to simultaneously track the dynamic behavior of numerous data sources, such as IoT devices, to identify issues (or opportunities) as quickly as possible and give systems managers the best possible situational awareness. To either validate streaming analytics code for a complex physical system or model its behavior, it is useful to simulate the devices and the telemetry that they generate. The ScaleOut Digital Twin Streaming Service now enables digital twins to simplify both tasks.

Use Digital Twins to Simulate a Workload for Streaming Analytics

Digital twins can implement a workload generator that generates telemetry used in validating streaming analytics code. Each digital twin models the behavior of a physical data source, such as a vehicle in a fleet, and the messages it sends and receives. When running in simulation, thousands of digital twins can then generate realistic telemetry for all data sources and feed streaming analytics, such as a telematics application, designed to track and analyze their behavior. In fact, the streaming service enables digital twins to implement both the workload generator and the streaming analytics. Once the analytics code has been validated in this manner, developers can then deploy it to track a live system.

Here’s an example of using a digital twin to simulate the operations of a pump and the telemetry (such as the pump’s temperature and RPM) that it generates. Running in simulation, this simulated pump sends telemetry messages to a corresponding real-time digital twin that analyzes the telemetry to predict impending issues:

Once the simulation has validated the analytics, the real-time digital twin can be deployed to analyze telemetry from an actual pump:

This example illustrates how digital twins can both simulate devices and provide streaming analytics for a live system.
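To make the pump example concrete, here is a minimal Python sketch of a simulated pump feeding a real-time twin. The class names, the 90-degree threshold, and the "three consecutive readings" rule are assumptions chosen for illustration; they are not the ScaleOut Digital Twin Streaming Service API.

```python
import random
from dataclasses import dataclass, field


@dataclass
class SimulatedPump:
    """Simulation twin: models a pump and emits one telemetry message per step."""
    pump_id: str
    temperature: float = 70.0
    rpm: int = 1500
    overheating: bool = False  # scenario parameter to vary between runs

    def step(self, rng):
        drift = 5.0 if self.overheating else 0.0
        self.temperature += drift + rng.uniform(-0.5, 0.5)
        return {"pump_id": self.pump_id,
                "temperature": self.temperature,
                "rpm": self.rpm}


@dataclass
class RealTimePumpTwin:
    """Real-time twin: keeps per-pump state and analyzes incoming telemetry."""
    pump_id: str
    max_temperature: float = 90.0  # hypothetical alert threshold
    readings_over_limit: int = 0
    alerts: list = field(default_factory=list)

    def process_message(self, message):
        if message["temperature"] > self.max_temperature:
            self.readings_over_limit += 1
        else:
            self.readings_over_limit = 0
        # Alert once, on the third consecutive high reading, to ignore noise.
        if self.readings_over_limit == 3:
            self.alerts.append(f"{self.pump_id}: sustained high temperature")


rng = random.Random(42)
pump = SimulatedPump("pump-1", overheating=True)
twin = RealTimePumpTwin("pump-1")
for _ in range(10):
    twin.process_message(pump.step(rng))
print(twin.alerts)
```

Because the real-time twin only sees messages, the same process_message logic could in principle be pointed at telemetry from a physical pump once the simulated runs look right.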

Using digital twins to build a workload generator enables investigation of a wide range of scenarios that might be encountered in typical, real-world use. Developers can implement parameterizable, stateful models of physical data sources and then vary these parameters in simulation to evaluate the ability of streaming analytics to analyze and respond in various situations. For example, digital twins could simulate perimeter devices detecting security intrusions in a large infrastructure to help evaluate how well streaming analytics can identify and classify threats. In addition, the streaming service can capture and record live telemetry and later replay it in simulation.

Use Digital Twins to Simulate a Large System with Many Entities

In addition to using digital twins for analyzing telemetry, the ScaleOut Digital Twin Streaming Service enables digital twins to implement time-driven simulations that model large groups of interacting physical entities. Digital twins can model individual entities within a large system, such as airline passengers, aircraft, airport gates, and air traffic sectors in a comprehensive airline model. These digital twins maintain state information about the physical entities they represent, and they can run code at each time step in the simulation model’s execution to update digital twin state over time. These digital twins also can exchange messages that model interactions.

For example, an airline tracking system can use simulation to model numerous types of weather delays and system outages (such as ground stops) to see how the system manages passenger needs. As the simulation model evolves over time, simulated aircraft can model flight delays and send messages to simulated passengers, who react by updating their itineraries. Here is a depiction of an airline tracking simulation:

In contrast to digital twins used for PLM, which typically embody a complex design within a single digital twin model, the ScaleOut Digital Twin Streaming Service enables large numbers of physical entities and their interactions to be simulated. By doing this, simulations can model intricate behaviors that evolve over time and reveal important insights during system design and optimization. They also can be fed live data and run faster than real time as a tool for making predictions that assist decision-making by managers (such as airline dispatchers).

Scalable, In-Memory Computing Makes It Possible

Digital twins offer a compelling software architecture for implementing time-driven simulations with thousands of entities. In a typical implementation, developers create multiple digital twin models to describe the state information and simulation code representing various physical entities, such as trucks, cargo, and warehouses in a telematics simulation. They create instances of these digital twin models (simply called digital twins) to implement all of the entities being simulated, and the streaming service runs their code at each time step being simulated. During each time step, digital twins can exchange messages that represent simulated interactions between physical entities.

The ScaleOut Digital Twin Streaming Service uses scalable, in-memory computing technology to provide the speed and memory capacity needed to run large simulations with many entities. It stores digital twins in memory and automatically distributes them across a cluster of servers that hosts a simulation. At each time step, each server runs the simulation code for a subset of the digital twins and determines the next time step that the simulation needs to run. The streaming service orchestrates the simulation’s progress on the cluster and advances simulation time at a rate selected by the user.
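The time-stepped execution and message exchange described above can be sketched in a few lines of Python. This is a single-process toy under stated assumptions: the Simulation, Aircraft, and Passenger classes are hypothetical, the step size is fixed (the service itself determines the next time step that needs to run), and nothing here is distributed across servers.

```python
from dataclasses import dataclass, field


class Simulation:
    """Runs every twin at each time step and delivers messages between steps."""

    def __init__(self, twins, step):
        self.twins = twins          # twin_id -> twin object
        self.step = step            # simulated minutes per time step
        self.now = 0
        self._outbox = []

    def send(self, recipient_id, message):
        self._outbox.append((recipient_id, message))

    def run(self, end_time):
        while self.now <= end_time:
            for twin in self.twins.values():
                twin.process_time_step(self.now, self)
            # Deliver after all twins have run, so messages arrive in the next step.
            for recipient_id, message in self._outbox:
                self.twins[recipient_id].inbox.append(message)
            self._outbox.clear()
            self.now += self.step


@dataclass
class Aircraft:
    twin_id: str
    departure: int            # scheduled departure, in simulated minutes
    delay_at: int             # when the delay becomes known (scenario parameter)
    delay_minutes: int
    passenger_ids: list
    inbox: list = field(default_factory=list)
    delayed: bool = False

    def process_time_step(self, now, sim):
        if not self.delayed and now >= self.delay_at:
            self.delayed = True
            self.departure += self.delay_minutes
            for passenger_id in self.passenger_ids:
                sim.send(passenger_id, {"new_departure": self.departure})


@dataclass
class Passenger:
    twin_id: str
    connection_departure: int
    inbox: list = field(default_factory=list)
    needs_rebooking: bool = False

    def process_time_step(self, now, sim):
        for message in self.inbox:
            # Hypothetical rule: the flight takes 60 minutes to reach the connection.
            if message["new_departure"] + 60 > self.connection_departure:
                self.needs_rebooking = True
        self.inbox.clear()


twins = {
    "p1": Passenger("p1", connection_departure=240),
    "p2": Passenger("p2", connection_departure=400),
    "f100": Aircraft("f100", departure=120, delay_at=30,
                     delay_minutes=90, passenger_ids=["p1", "p2"]),
}
Simulation(twins, step=15).run(end_time=60)
print(twins["p1"].needs_rebooking, twins["p2"].needs_rebooking)
```

Each twin touches only its own state and communicates through messages, which is what would allow a platform to spread the twins across servers without changing the model code.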

In this manner, the streaming service can harness as many servers as it needs to host a large simulation and run it with maximum throughput. As illustrated below, the service’s in-memory computing platform can add new servers while a simulation is running, and it can transparently handle server outages should they occur. Users need only focus on building digital twin models and deploying them to the streaming service.

The Next Generation of Simulation with Digital Twins

Digital twins have historically been employed as a tool for simulating increasingly detailed behavior of a complex physical entity, like a jet engine. The ScaleOut Digital Twin Streaming Service takes digital twins in a new direction: simulation of large systems. Its highly scalable, in-memory computing architecture enables it to easily simulate many thousands of entities and their interactions. This provides a powerful new tool for extracting insights about complex systems that today’s managers must operate at peak efficiency. Its analytics and predictive capabilities promise to offer a high return on investment in many industries.



Steve Smith Review: Simplify Redis Clustering with ScaleOut IMDB

Mon, 05 Dec 2022 22:38:25 +0000

Check out the blog post and video from distinguished software architect and .NET guru Steve “ardalis” Smith on the challenges of scaling single-server Redis and how ScaleOut In-Memory Database tackles them with fully automated cluster technology to avoid complex manual configuration steps.

Steve Smith is a well-known entrepreneur and software developer. He is passionate about building quality software and spreading his knowledge through training workshops, speaking at developer conferences, and sharing his experience on his blog and podcast. Steve Smith has also been recognized as a Microsoft MVP for over ten consecutive years.



Introducing A New Execution Platform for Redis Clients

Tue, 29 Mar 2022 13:00:46 +0000

The Challenge

Redis®* offers a compelling set of data structures that enhance the capabilities of a distributed cache beyond just storing serialized objects. Created in 2009 as a single-server store to assist in the design of a web server, Redis gives applications numerous useful options for organizing stored data, including sets, lists, and hashes. Cluster support was added later, and it introduced specialized concepts, like hashslots and master/replica shards, that system administrators must understand and manage. Along with its use of eventual consistency, this has created complexity that makes cluster management challenging while reducing flexibility in configurations.

In contrast, ScaleOut StateServer®, a distributed cache for serialized objects first released in 2005, was designed from the ground up to run on a server cluster with automated load-balancing, data replication, and recovery while storing data with full consistency (i.e., sequential consistency) across replicas. It also executes client requests using all available processing cores for maximum throughput. These features dramatically simplify cluster management, especially for enterprise users, improve flexibility, and lower total cost of ownership (TCO). For example, unlike Redis, ScaleOut server clusters can seamlessly grow from a single server to multiple servers, and system administrators do not need to manage hashslots or master/replica shards. See a recent blog post that discusses how ScaleOut StateServer simplifies cluster management in comparison to Redis.

ScaleOut Software recognized that running Redis commands on a ScaleOut StateServer cluster would offer Redis users the best of both worlds: familiar and rich data structures combined with significantly simpler cluster management and full data consistency. However, the ideal implementation would need to use Redis open-source code to execute Redis commands so that client commands would behave identically to open-source Redis clusters. The challenge is then to integrate Redis code into ScaleOut StateServer’s execution platform and take advantage of ScaleOut’s highly automated clustering features while eliminating the single-threaded constraints of Redis’s event-loop architecture.

Integrating Redis into ScaleOut StateServer

Released as a community preview, version 5.11 of ScaleOut StateServer introduces support for the most popular Redis data structures (strings, sets, lists, hashes, and sorted sets) plus publish/subscribe commands, transactions, and various utility commands (such as FLUSHDB and EXPIRE). Both Windows and Linux versions are available. This release uses open-source Redis version 6.2.5 to process Redis commands.

Redis clients connect to any ScaleOut StateServer server in a cluster using the standard RESP protocol. (A cluster can contain one or more servers.) Client libraries internally obtain the mapping of hashslots to servers using either the CLUSTER SLOTS or CLUSTER NODES commands and then direct Redis access requests to the appropriate ScaleOut server. To maximize throughput, each ScaleOut server processes incoming Redis commands on multiple threads using all available processor cores; there is no need to deploy multiple shards on each server for this purpose.
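For readers unfamiliar with how a client library turns a key into a server address, the sketch below follows the key-to-hashslot rule from the open-source Redis Cluster specification (CRC16 of the key, modulo 16384, with hash-tag support). The slot ranges and addresses are hypothetical stand-ins for what a CLUSTER SLOTS reply conveys; real client libraries perform this lookup internally.

```python
import binascii

NUM_SLOTS = 16384  # number of hash slots in the Redis Cluster specification


def key_hash_slot(key: bytes) -> int:
    """Return the Redis Cluster hash slot for a key, honoring {hash tags}."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' is hashed.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    # binascii.crc_hqx with an initial value of 0 is CRC-16/XMODEM
    # (polynomial 0x1021), the variant named by the specification.
    return binascii.crc_hqx(key, 0) % NUM_SLOTS


# Hypothetical slot ranges and addresses, shaped like a slot-to-server mapping.
slot_ranges = [
    (0, 8191, "10.0.0.4:6379"),
    (8192, 16383, "10.0.0.5:6379"),
]


def server_for_key(key: bytes) -> str:
    slot = key_hash_slot(key)
    for first, last, address in slot_ranges:
        if first <= slot <= last:
            return address
    raise LookupError(f"no server owns slot {slot}")


print(key_hash_slot(b"foo"), server_for_key(b"foo"))
# Keys sharing a hash tag land in the same slot, which multi-key commands need.
print(key_hash_slot(b"{user1000}.following") == key_hash_slot(b"{user1000}.followers"))
```

Because the slot depends only on the key, every client computes the same answer without contacting a server; only the slot-to-server table has to be refreshed when cluster membership changes.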

The following diagram shows a set of Redis clients connecting to a ScaleOut StateServer cluster. Note that the complexities of hashslots and shards have been eliminated:

As the need for additional throughput grows, system administrators can simply join new servers to the cluster. ScaleOut StateServer automatically rebalances the hashslots across the cluster as servers are added or removed. It also delays execution of Redis commands during load-balancing (and recovery) to give clients a consistent picture of hashslot placement and avoid client exceptions. After a hashslot has fully migrated to a remote server, a requesting client is returned the Redis -MOVED indication so that it can redirect its request to the new server.
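The redirect step can be sketched as follows. In the Redis protocol, a MOVED error carries the hashslot number and the address of the server that now owns it; the parse_moved helper and SlotTable class below are hypothetical names for illustration, and production client libraries already implement this handling.

```python
def parse_moved(error_text):
    """Parse 'MOVED <slot> <host>:<port>'; return None for any other error."""
    parts = error_text.lstrip("-").split()
    if len(parts) != 3 or parts[0] != "MOVED":
        return None
    host, _, port = parts[2].rpartition(":")
    return int(parts[1]), host, int(port)


class SlotTable:
    """Client-side cache of which server owns each hashslot."""

    def __init__(self):
        self._owner = {}

    def owner(self, slot):
        return self._owner.get(slot)

    def handle_error(self, error_text):
        """Update the table from a redirect; return True if the caller should retry."""
        redirect = parse_moved(error_text)
        if redirect is None:
            return False
        slot, host, port = redirect
        self._owner[slot] = (host, port)
        return True


table = SlotTable()
should_retry = table.handle_error("-MOVED 3999 127.0.0.1:6381")
print(should_retry, table.owner(3999))
```

After updating its table, a client reissues the command to the new owner; many libraries also refresh the full slot map at that point rather than patching a single entry.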

The following diagram illustrates how ScaleOut StateServer automatically manages hashslots. In this example, it migrates half of the hashslots to a second server that joins a cluster:

ScaleOut StateServer automatically creates replicas for all hashslots. There is no need for system administrators to manually create master and replica shards or move them from server to server during membership changes and recovery. ScaleOut StateServer automatically places replicas on different servers from their corresponding primary hashslots and migrates them as necessary during membership changes to ensure optimal load-balancing. If a server fails or has a network outage, ScaleOut StateServer automatically “self-heals” by promoting replicas to primaries and creating new replicas as necessary.

To avoid serving stale data to clients after recovery from an outage, ScaleOut StateServer uses a patented quorum algorithm to implement fully consistent updates to stored objects. In contrast, Redis uses an eventual consistency model for updating replicas. (To maximize throughput at the expense of data consistency, ScaleOut StateServer can optionally be configured for eventual consistency.) When a server receives a Redis command, it executes this command on a quorum containing the primary hashslot and replicas (one or two in the current implementation) prior to returning to the client. Transactions are processed in the same manner.

The following diagram compares the fully consistent and eventually consistent models for updating replicas and shows how they differ in behavior. A fully consistent update waits for the replica to be updated prior to returning to the client, whereas an eventually consistent update does not. With eventual consistency, if a primary server should fail prior to committing the replica’s update, the cluster could lose the update and serve stale data to clients.
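The difference between the two models can be demonstrated with a toy primary/replica pair. This Python sketch is only a model of when an update is acknowledged; the ReplicatedSlot class is hypothetical and does not represent ScaleOut's patented quorum algorithm or Redis's replication code.

```python
class ReplicatedSlot:
    """Toy primary/replica pair showing when an update is acknowledged."""

    def __init__(self, fully_consistent):
        self.fully_consistent = fully_consistent
        self.primary = {}
        self.replica = {}
        self.unreplicated = []  # updates acknowledged but not yet on the replica

    def set(self, key, value):
        self.primary[key] = value
        if self.fully_consistent:
            self.replica[key] = value                # update replica, then acknowledge
        else:
            self.unreplicated.append((key, value))   # acknowledge now, replicate later
        return "OK"

    def replicate_pending(self):
        for key, value in self.unreplicated:
            self.replica[key] = value
        self.unreplicated.clear()

    def fail_primary(self):
        # The primary is lost along with any updates that never reached the replica.
        self.unreplicated.clear()
        self.primary = self.replica          # promote the replica
        self.replica = dict(self.primary)    # create a fresh replica

    def get(self, key):
        return self.primary.get(key)


def run_scenario(fully_consistent):
    slot = ReplicatedSlot(fully_consistent)
    slot.set("cart:42", "1 item")
    slot.replicate_pending()
    slot.set("cart:42", "2 items")  # acknowledged to the client
    slot.fail_primary()             # primary fails before background replication
    return slot.get("cart:42")


print(run_scenario(fully_consistent=True))
print(run_scenario(fully_consistent=False))
```

In the eventually consistent run, the client was told its second update succeeded, yet the promoted replica still returns the older value.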

Implementation Details

The following diagram shows how Redis open-source code has been integrated into ScaleOut StateServer:

Redis open-source code (shown in the red box) implements command parsing and processing, the data structure commands, transactions, publish/subscribe commands, and blocking commands. ScaleOut StateServer takes over all clustering functions, including request processing, membership, quorum processing of updates, load-balancing, recovery, and self-healing. It also uses a proprietary transport protocol for server-to-server communication.

As illustrated below, ScaleOut StateServer uses multi-threaded execution for Redis commands to take advantage of all processing cores and eliminate the need for multiple primary shards on each server. In contrast, Redis executes commands using an event loop that processes commands sequentially on a single processing core:

To accomplish this, ScaleOut StateServer has implemented a command scheduler that independently executes commands for each hashslot so that they can run in parallel without global locking.
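One way to picture per-hashslot scheduling is a set of per-slot queues drained by a shared thread pool, so that commands for one slot run in order while different slots proceed in parallel. The sketch below is an assumption-laden toy, not ScaleOut's scheduler: the SlotScheduler class is hypothetical, and it uses one short-lived lock for queue bookkeeping (never held while a command executes).

```python
import threading
from collections import defaultdict, deque
from concurrent.futures import Future, ThreadPoolExecutor


class SlotScheduler:
    """Toy scheduler: FIFO order within a hashslot, parallelism across slots."""

    def __init__(self, workers):
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._queues = defaultdict(deque)  # slot -> pending commands
        self._active = set()               # slots that currently have a drainer
        self._guard = threading.Lock()     # protects bookkeeping, not execution

    def submit(self, slot, command, *args):
        future = Future()
        with self._guard:
            self._queues[slot].append((future, command, args))
            if slot not in self._active:
                self._active.add(slot)
                self._pool.submit(self._drain, slot)
        return future

    def _drain(self, slot):
        # Only one drainer exists per slot, so its commands never overlap.
        while True:
            with self._guard:
                if not self._queues[slot]:
                    self._active.discard(slot)
                    return
                future, command, args = self._queues[slot].popleft()
            try:
                future.set_result(command(*args))  # runs outside the guard
            except Exception as exc:
                future.set_exception(exc)

    def shutdown(self):
        self._pool.shutdown(wait=True)


# Each slot has its own list; appends within a slot must stay in submit order.
log = {0: [], 1: [], 2: []}
scheduler = SlotScheduler(workers=4)
futures = [scheduler.submit(i % 3, log[i % 3].append, i) for i in range(30)]
for future in futures:
    future.result()
scheduler.shutdown()
print(log[0])
```

Note that in CPython the global interpreter lock limits true CPU parallelism for pure-Python commands, so this sketch shows the scheduling structure rather than the throughput gain.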

What’s Missing?

The community preview release focuses on demonstrating support for Redis data structures, which represent the widely used core of Redis functionality. It does not include support for Redis streams, Lua scripting, modules, AOF/RDB persistence, ACLs, and Redis configuration files. In addition, many utility commands which are not required, such as cluster commands for manually moving hashslots, are not supported. Lastly, this version does not incorporate all of the performance enhancements in development for the production release.

Summing Up

ScaleOut’s new integration of Redis open-source code into ScaleOut StateServer was designed to bring powerful new capabilities to Redis users while ensuring native-Redis behavior for client applications. Targeted to meet the needs of enterprise users, it dramatically simplifies the management of Redis clusters by automating all cluster operations, and it ensures that fully consistent updates are performed by Redis commands. In addition, this integration runs alongside ScaleOut StateServer’s native APIs, which incorporate advanced features not available on open-source Redis clusters, such as data-parallel computing, streaming analytics, and coherent, wide-area data replication.

ScaleOut Software is excited to hear your feedback about the community preview and learn what additional features you would like to see in the upcoming production release. You can download ScaleOut StateServer, which incorporates the preview release, here for Linux or Windows and try it out now. Let us know what you think.

*Redis is a registered trademark of Redis Ltd. and the Redis box logo is a mark of Redis Ltd. Any rights therein are reserved to Redis Ltd. Any use by ScaleOut Software is for referential purposes only and does not indicate any sponsorship, endorsement or affiliation between Redis and ScaleOut Software.



Redis vs ScaleOut: What You Need to Know

Mon, 09 Aug 2021 18:57:36 +0000

ScaleOut’s Battle-Tested Clustering Technology Gives It Key Advantages Over Redis®* in Ease of Use and Performance

By William L. Bain and Bryce C. Klinker

Breaking news: ScaleOut Software has announced a community preview of support for Redis clients in ScaleOut StateServer. Learn more here.

Distributed caching technology first hit the market in about 2001 with the introduction of Tangosol Coherence and has been evolving ever since. Designed to help applications scale performance by eliminating bottlenecks in accessing data, this distributed computing technology stores live, fast-changing data in memory across a cluster of inexpensive, commodity servers or virtual machines. The combination of fast, memory-based data storage and throughput scaling with multiple servers results in consistently fast access and update times for growing workloads, such as e-commerce, financial services, IoT device tracking, and other applications.

ScaleOut Software introduced its distributed caching product, ScaleOut StateServer® (SOSS), in 2005 and has made continuous enhancements over the last 16 years. While the single-server version of Redis was released in 2009 by Salvatore Sanfilippo, clustering support was first added in 2015. These two products embody highly different design goals. SOSS was designed as an integrated distributed caching architecture incorporating transparent throughput scaling and high availability using data replication with the goals of maximizing performance, ease of use, and portability across operating systems. In contrast, according to M. Russo, Redis was conceived as a single-server, data-structure store to improve the performance of a real-time data analytics product. (Beyond just storing strings or opaque objects, a data-structure store also implements various data types, such as lists and sorted sets.) Clustering was added to Redis’ single-server architecture about six years later to provide a way to scale.

As background for the following discussion, it’s important to review some key concepts. Most distributed caches use a key/value storage model that identifies stored objects using string keys. To distribute objects across multiple servers in a cluster, a distributed cache typically maps keys to hash slots, each of which holds a subset of objects. The cache then distributes hash slots across the servers and moves them between servers as needed to balance the workload; this process is called sharding. A group of hash slots running on a single server (called a node here) can either be a primary or replica. Clients direct updates to the target hash slot on a primary node, which replicates the update to one or more replica nodes for high availability in case the primary node fails.
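The key-to-slot-to-node mapping described above can be sketched in a few lines of Python. This is a minimal illustration, not the hashing scheme of either product: the CRC32 hash, the contiguous slot ranges, and the names `slot_for_key`, `assign_slots`, and `node_for_key` are all assumptions made for the example. The slot count of 16384 is borrowed from the Redis resharding example later in this post.

```python
import zlib

NUM_SLOTS = 16384  # slot count taken from the Redis example later in the post


def slot_for_key(key: str) -> int:
    """Map a string key to a hash slot using a stable hash (CRC32 is an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_SLOTS


def assign_slots(nodes: list[str]) -> dict[int, str]:
    """Spread the slots across the nodes in contiguous, nearly equal ranges."""
    per_node = NUM_SLOTS // len(nodes)
    owner = {}
    for slot in range(NUM_SLOTS):
        # The last node absorbs any remainder left by integer division.
        index = min(slot // per_node, len(nodes) - 1)
        owner[slot] = nodes[index]
    return owner


def node_for_key(key: str, owner: dict[int, str]) -> str:
    """Find the primary node that currently holds the key's hash slot."""
    return owner[slot_for_key(key)]


owners = assign_slots(["node0", "node1", "node2"])
print(node_for_key("cart:1001", owners))
```

Because clients only depend on the slot number, slots can later be moved between nodes without changing how keys are hashed.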

Ease of Use

The differences in design goals of the two technologies have led to very different impacts on users. To maximize ease of use, SOSS automatically creates and manages hash slots for the user, including primaries and replicas. Using a built-in load-balancer, each service internally manages a subset of both primary and replica hash slots, as illustrated below. Users just create a single SOSS service process on every node, and these service processes discover each other and distribute the hash slots among themselves to balance the workload. They also automatically handle all aspects of recovery after a node fails.

In contrast, Redis users create separate service processes on each node for primary and replica hash slots and must manually distribute the hash slots among the primaries. (Unlike SOSS, a 1-node or 2-node Redis cluster is not allowed.) As we will see below, users must perform a complex set of manual actions when adding and removing nodes and to heal and rebalance the cluster after a node fails. The following diagram illustrates the difference between Redis and SOSS in the user’s view of the cluster:

Adding a Node to the Cluster Using SOSS

To illustrate how SOSS’s built-in mechanisms for managing hash slots, load-balancing, failure detection, and self-healing simplify cluster management, let’s look at the steps needed to add a node to the cluster. When using SOSS, the user just installs the service on a new node and clicks a button in the management console to join the cluster. Using multicast discovery (or optional host list if multicast is not available), the service process automatically receives primary and replica hash slots and starts handling its portion of the workload. The following diagram shows the addition of a fourth node to a cluster:

Adding a Node to the Cluster Using Redis

Because Redis requires the user to manage the creation of primary and replica service processes (sometimes called shards) and the management of hash slots, many more steps must be performed to add a node to the cluster. To accomplish this, the user runs administrative commands that create the new processes, connect the primaries and replicas, move the replicas as necessary, and reallocate the hash slots among the nodes. The required configuration changes are illustrated below:

Here is an example of administrative steps required to make the configuration changes (using node 0’s IP and port as the bootstrap address for the new node):

# Start up a new replica redis-server instance on node 3 for primary 2:
redis-cli --cluster add-node host3Ip:replicaPort node0Ip:node0Port --cluster-slave \
          --cluster-master-id primary2NodeID
# Start up a new primary redis-server instance on node 3:
redis-cli --cluster add-node host3Ip:primaryPort existingIp:existingPort
# Connect to replica 2 on node 0 and modify it to replicate primary 3:
redis-cli -h replica2Ip -p replica2Port cluster replicate primary3NodeID
# Reshard the cluster by interactively moving hash slots from existing nodes to node 3:
redis-cli --cluster reshard existingIp:existingPort
> How many slots to move? 4096    # 16384 / 4 = 4096
> What node to move slots to? primary3NodeID    # (primary3NodeID returned by previous command)
> What nodes to move slots from? all
This process is complex, and it becomes more difficult to keep track of the distribution of hash slots with larger cluster memberships. Removing a node has comparable complexity.
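The arithmetic behind the resharding step (16384 / 4 = 4096) can be worked through with a small helper that computes how many slots each existing primary would hand to a new node. This is only a sketch under stated assumptions: the function name `plan_reshard` and the starting slot counts are hypothetical, and it assumes the cluster is already roughly balanced.

```python
def plan_reshard(slot_counts: dict[str, int]) -> dict[str, int]:
    """Return how many slots each existing primary gives up when one node is added.

    slot_counts maps each existing primary to the number of slots it holds.
    After the move, every node (including the new one) holds an equal share,
    give or take one slot.
    """
    nodes_after = len(slot_counts) + 1
    total = sum(slot_counts.values())
    base, extra = divmod(total, nodes_after)
    moves = {}
    for i, (node, count) in enumerate(sorted(slot_counts.items())):
        keep = base + (1 if i < extra else 0)  # a few nodes keep one extra slot
        moves[node] = max(0, count - keep)     # underloaded nodes give up nothing
    return moves


# Hypothetical 3-primary cluster holding all 16384 slots.
current = {"primary0": 5461, "primary1": 5461, "primary2": 5462}
moves = plan_reshard(current)
print(moves, sum(moves.values()))
```

For this starting layout the moved slots sum to 4096, which matches the number entered at the reshard prompt.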

Recovering After a Node Fails (SOSS and Redis)

SOSS’s service processes automatically detect and recover from the loss of a node. They use built-in, scalable, peer-to-peer heart-beating to detect missing node(s) and create a new, coherent cluster membership. Next, they promote replica hash slots to primaries on the surviving nodes, create new replicas for self-healing, and rebalance the workload across the nodes.

Redis does not implement a coherent cluster membership and does not provide automatic self-healing and recovery. Each Redis node sends heartbeat messages to random other nodes to detect possible failures, and the cluster uses a gossip mechanism to declare that a node has failed. After that, its replica on a different node promotes itself to a primary so that the hash slots remain available, but Redis does not self-heal by creating a new replica for the hash slots. Also, it does not automatically redistribute the hash slots across the nodes to rebalance the workload. These tasks are left to the system administrator, who needs to sort out the needed configuration changes and implement them to restore a fully redundant, balanced cluster.
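To make the recovery steps concrete, here is a minimal sketch of replica promotion followed by self-healing, the step that Redis leaves to the administrator. It is not either product's implementation: the `recover` function, the group names, and the least-loaded placement rule are assumptions for illustration, and the sketch does not rebalance primaries afterward.

```python
from collections import Counter


def recover(primaries: dict[str, str], replicas: dict[str, str], failed: str):
    """Return new (primaries, replicas) maps after the node `failed` is lost.

    Keys are hash-slot group names and values are node names. Assumes each
    group's primary and replica live on different nodes and that at least
    two nodes survive.
    """
    new_primaries, new_replicas = {}, {}
    for group, node in primaries.items():
        if node == failed:
            # Promote the surviving replica so the slots stay available.
            new_primaries[group] = replicas[group]
        else:
            new_primaries[group] = node
            if replicas[group] != failed:
                new_replicas[group] = replicas[group]

    all_nodes = set(primaries.values()) | set(replicas.values())
    survivors = sorted(all_nodes - {failed})
    load = Counter(new_primaries.values()) + Counter(new_replicas.values())

    # Self-healing: give every group without a replica a new one on the
    # least-loaded surviving node that does not hold its primary.
    for group in primaries:
        if group not in new_replicas:
            candidates = [n for n in survivors if n != new_primaries[group]]
            target = min(candidates, key=lambda n: (load[n], n))
            new_replicas[group] = target
            load[target] += 1
    return new_primaries, new_replicas


primaries = {"A": "node0", "B": "node1", "C": "node2"}
replicas = {"A": "node1", "B": "node2", "C": "node0"}
new_primaries, new_replicas = recover(primaries, replicas, failed="node1")
print(new_primaries)
print(new_replicas)
```

After the call, every group again has a primary and a replica on different surviving nodes, which is the fully redundant state described above.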

Performance Comparison

The different design choices between SOSS and Redis also lead to semantic and performance differences. To maximize ease of use for application developers, SOSS maintains all stored data with full consistency (to be more precise, sequential consistency), ensuring that it only serves the latest updates and never loses data after the failure of a single server (or two servers if multiple replicas are used). This design choice targets enterprise applications that need to ensure that the distributed cache always returns the correct data. To implement data replication across multiple replicas with the highest possible performance, SOSS uses a patented quorum algorithm.

In contrast, Redis employs an eventual consistency model with asynchronous replication. In general, this choice enables higher throughput because updates do not have to wait for replication to complete before responding to the user. It also enables potentially higher read throughput by serving reads from replicas even if they are not guaranteed to serve the latest updates.
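The trade-off between the two replication styles can be shown with a toy model. This sketch assumes a single primary and a single replica, and the classes `Primary` and `Replica` are hypothetical. It does not model SOSS's quorum algorithm or Redis's replication protocol; it only shows when a replica can serve stale data.

```python
from collections import deque


class Replica:
    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class Primary:
    def __init__(self, replica: Replica, synchronous: bool):
        self.data = {}
        self.replica = replica
        self.synchronous = synchronous
        self.pending = deque()  # updates acknowledged but not yet replicated

    def update(self, key, value) -> str:
        self.data[key] = value
        if self.synchronous:
            # Commit at the replica before acknowledging the client.
            self.replica.apply(key, value)
        else:
            # Acknowledge immediately; replicate in the background later.
            self.pending.append((key, value))
        return "ack"

    def flush(self):
        """Deliver queued updates to the replica (stands in for background replication)."""
        while self.pending:
            self.replica.apply(*self.pending.popleft())


async_replica = Replica()
async_primary = Primary(async_replica, synchronous=False)
async_primary.update("cart:1", "v1")
print(async_replica.data.get("cart:1"))  # the replica has not seen the update yet

sync_replica = Replica()
sync_primary = Primary(sync_replica, synchronous=True)
sync_primary.update("cart:1", "v1")
print(sync_replica.data.get("cart:1"))   # the replica already holds the update
```

In the asynchronous case, a read served by the replica before `flush()` runs misses the update, and a primary failure in that window would lose it.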

Given these two design choices, it’s valuable to compare the throughput of the two distributed caches as nodes are added and the workload is simultaneously increased, as illustrated below. This technique evaluates how well the caches can scale their throughput by adding nodes to handle increasing workload; linear throughput scaling ensures consistently fast response times. (For a discussion of throughput scaling in distributed systems, see Gustafson’s Law.)
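As a quick reference, Gustafson’s Law gives the scaled speedup on N nodes as N - s(N - 1), where s is the fraction of the work that cannot be parallelized. The sketch below evaluates that formula; the serial fractions passed in are illustrative values, not measurements from the benchmark.

```python
def gustafson_speedup(nodes: int, serial_fraction: float) -> float:
    """Scaled speedup when the workload grows with the number of nodes."""
    return nodes - serial_fraction * (nodes - 1)


def scaling_efficiency(nodes: int, serial_fraction: float) -> float:
    """Fraction of ideal (linear) scaling that is achieved."""
    return gustafson_speedup(nodes, serial_fraction) / nodes


for n in (3, 4, 6):
    # 0.05 is an arbitrary serial fraction chosen only for illustration.
    print(n, gustafson_speedup(n, 0.05), round(scaling_efficiency(n, 0.05), 3))
```

With a serial fraction of zero the speedup equals the node count, which is the linear scaling the comparison looks for.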

To perform an apples-to-apples throughput comparison of Redis 6.2 and SOSS 5.10, SOSS was configured to use eventual consistency (“EC”) when updating replicas. The performance of SOSS with full consistency (“FC”) was also measured. Tests were run for 3-, 4-, and 6-node clusters in AWS on m5.xlarge instances with 4 cores at 2.5 GHz and 16 GB of RAM. The clients ran read/update pairs on 100K objects of sizes 2 KB and 20 KB to represent a typical web workload with a 1:1 read/update ratio. The results are as follows:

SOSS provided consistently higher throughput than Redis when eventual consistency was used to perform updates (the blue and gray lines in the charts). Running SOSS with full consistency (the red lines) resulted in lower throughput, as expected, since updates have to be committed at the replica before responding to the client instead of being performed asynchronously. However, both Redis and SOSS with full consistency delivered close to the same throughput for 20KB objects. This may be due to benefits of SOSS’s client-side caching, which eliminated unnecessary data transfers during reads.

Summing Up

Our comparison of SOSS and Redis shows the benefits of ScaleOut’s integrated clustering architecture. A key design goal for SOSS was to simplify the user’s workload by providing a unified, location-transparent data cache with built-in, fully automatic load-balancing and high availability. Because SOSS hides the inner workings of hash slots, heart-beating, replica placement, load-balancing, and self-healing, the application developer and systems administrator can focus on simply using the distributed cache instead of configuring its implementation. In our view, Redis’s approach of exposing these complex mechanisms to the user significantly steepens the learning curve and increases the user’s workload.

It might come as a surprise to learn that in the above benchmark testing, SOSS maintained a consistent performance advantage. We attribute this to ScaleOut’s approach of designing an integrated cluster architecture from the outset instead of adding clustering to a single server data store, as Redis did. This approach enabled design freedom at every step to eliminate distributed bottlenecks, and it led to extensive use of multithreading and internal data sharding within each service process to extract maximum performance from multi-core servers.

Lastly, SOSS demonstrates that the CAP theorem doesn’t really prevent the use of full consistency when building a scalable, distributed cache. For many enterprise applications, which demand data integrity at all times, this may be the better choice.

Learn more about how ScaleOut StateServer compares to Redis.

*Redis is a registered trademark of Redis Ltd. and the Redis box logo is a mark of Redis Ltd. Any rights therein are reserved to Redis Ltd. Any use by ScaleOut Software is for referential purposes only and does not indicate any sponsorship, endorsement or affiliation between Redis and ScaleOut Software.


Deploying Real-Time Digital Twins On Premises with ScaleOut StreamServer DT

Tue, 06 Apr 2021 13:00:47 +0000

With the ScaleOut Digital Twin Streaming Service, an Azure-hosted cloud service, ScaleOut Software introduced breakthrough capabilities for streaming analytics using the real-time digital twin concept. This new software model enables applications to easily analyze telemetry from individual data sources in 1-3 milliseconds while maintaining state information about data sources that deepens introspection. It also provides a basis for applications to create key status information that the streaming platform aggregates every few seconds to maximize situational awareness. Because it runs on a scalable, highly available in-memory computing platform, it can do all this simultaneously for hundreds of thousands or even millions of data sources.

The unique capabilities of real-time digital twins can provide important advances for numerous applications, including security, fleet telematics, IoT, smart cities, healthcare, and financial services. These applications are all characterized by numerous data sources which generate telemetry that must be simultaneously tracked and analyzed, while maintaining overall situational awareness that immediately highlights problems of concern and/or opportunities of interest. For example, consider some of the new capabilities that real-time digital twins can provide in fleet telematics and vaccine distribution during COVID-19.

To address security requirements or the need for tight integration with existing infrastructure, many organizations need to host their streaming analytics platform on-premises. ScaleOut StreamServer® DT was created to meet this need. It combines the scalable, battle-tested in-memory data grid that powers ScaleOut StreamServer with the graphical user interface and visualization features of the cloud service in a unified, on-premises deployment. This gives users all of the capabilities of the ScaleOut Digital Twin Streaming Service with complete infrastructure control.

As illustrated in the following diagram, ScaleOut StreamServer DT installs its management console on a standalone server that connects to ScaleOut StreamServer’s in-memory data grid. This console hosts the graphical user interface that is securely accessed by remote workstations within an organization. It also deploys real-time digital twin models to the in-memory data grid, which hosts instances of digital twins (one per data source) and runs application-defined code to process incoming messages. Messages are delivered to the grid using messaging hubs, such as Azure IoT Hub, AWS IoT Core, Kafka, a built-in REST service, or directly using APIs.

The management console installs as a set of Docker containers on the management server. This simplifies the installation process and ensures portability across operating systems. Once installed, users can create accounts to control access to the console, and all connections are secured using SSL. The results of aggregate analytics and queries performed within the in-memory data grid can then be accessed and visualized on workstations running throughout an organization.

Because ScaleOut’s in-memory data grid runs in an organization’s data center and avoids the requirement to use a cloud-hosted message hub or REST service, incoming messages from data sources can be processed with minimum latency. In addition, application code running in real-time digital twins can access local resources, such as databases and alerting systems, with the best possible performance and security. Use of dedicated computing resources for the in-memory data grid delivers the highest possible throughput for message processing and real-time analytics.

While cloud hosting of streaming analytics as a SaaS (software-as-a-service) offering creates clear advantages in reducing capital costs and providing access to highly elastic computing resources, it may not be suitable for organizations which need to maintain full control of their infrastructures to address security and performance requirements. ScaleOut StreamServer DT was designed to meet these needs and deliver the important, unique benefits of streaming analytics using real-time digital twins to these organizations.


ScaleOut Software Announces the Availability of ScaleOut GeoServer® Pro

Tue, 17 Nov 2020 15:48:41 +0000

New Product Release Uses In-Memory Data Storage to Combine High Availability and Synchronized Access Across Multiple Sites

BELLEVUE, Wash – November 17, 2020 – ScaleOut Software today announced ScaleOut GeoServer® Pro, a new software product release that integrates site-to-site data replication with fully coherent data access for its battle-tested ScaleOut StateServer® in-memory data grid (IMDG) and distributed cache. This release extends the company’s ScaleOut GeoServer® DR product, which provides asynchronous, site-to-site data replication to protect against site-wide failures and currently is in production use.

For more than fifteen years, ScaleOut StateServer has set the standard for high performance, reliability, and industry-leading ease of use at hundreds of enterprise sites around the world. The product stores fast-changing data in a wide variety of applications, including ecommerce, financial services, online learning, airline reservations, gaming and much more.

“With the release of ScaleOut GeoServer Pro, we are excited to offer our customers breakthrough capabilities for multi-site storage of their fast-changing data,” said Dr. William L. Bain, founder and CEO of ScaleOut Software. “Now they can take advantage of our industry-leading technology that replicates data across sites to protect against data center failures while making fully coordinated use of the sites.”

By harnessing ScaleOut GeoServer Pro, users can now take in-memory data storage and distributed caching to the next level with an integrated solution for disaster recovery and synchronized data access across multiple sites. This technology enables applications to both protect against site-wide failures and to maintain a consistent view of data stored at all data centers.

Key ScaleOut GeoServer Pro Benefits:

ScaleOut GeoServer Pro enables organizations to store, access and protect fast-changing data at multiple sites, while maintaining a consistent view of the data at all times. The technology ensures that critical data is always accessible and synchronized across locations.

Transparent Replication Across Data Centers: Applications can automatically replicate all stored data across multiple data centers for continuous availability in case a data center fails. This includes replicating in-memory data across different cloud regions, while automatically coordinating access to data stored at these sites.
Integrated, Synchronized Data Access: ScaleOut GeoServer Pro introduces optional, synchronized access to replicated data across multiple data centers. This enables applications that distribute workloads across data centers to maintain a straightforward, unified view of stored data. For example, web applications which use a global load balancer to share user data held in two data centers can now use both data centers in an “active-active” configuration.
Automatic Recovery from WAN Failures: In the event of a WAN failure between data centers, applications can independently access data at each data center. When WAN connectivity is re-established, ScaleOut GeoServer Pro automatically resolves inconsistencies in stored data caused by duplicate updates during the WAN outage, a situation known as a “split-brain” condition.
Maximize Application Performance: To maximize overall performance, ScaleOut GeoServer Pro’s caches offer flexible coherency policies so that applications can select the appropriate combination of coherency and access latency according to data usage. This avoids unnecessary WAN data usage and associated delays.

Additional Resources:

For more information about ScaleOut GeoServer Pro, please visit:

ScaleOut GeoServer Pro announcement blog post
ScaleOut GeoServer Pro product page

About ScaleOut Software

For more information, please visit www.scaleoutsoftware.com and follow @scaleout_inc

###

Contact:

RH Strategic for ScaleOut Software

ScaleOutPR@rhstrategic.com


Using Real-Time Digital Twins for Corporate Contact Tracing

Tue, 25 Aug 2020 13:00:23 +0000

A Demo Application Shows How Companies Can Track COVID-19 Contacts Within Companies

Until a COVID-19 vaccine is widely available, getting back to work means keeping a close watch for outbreaks and quickly containing them when they occur. While the prospects for accomplishing this within large companies seem daunting, tracking contacts between employees may be much easier than for the public at large. This blog post explains how a software application built with a new software construct called real-time digital twins makes this possible.

Tracking Employees Using Real-Time Digital Twins

In an earlier blog post, we saw how real-time digital twins running in the ScaleOut Digital Twin Streaming Service can be used to track employees within a large company using a technique called “voluntary self-tracing.” In this post, we’ll take a closer look at its implementation in a demo application created by ScaleOut Software. We’ll also look at a companion mobile app that allows employees to log contacts with colleagues outside their immediate teams and to notify the company and their contacts if they test positive for COVID-19.

The demo application creates a memory-based real-time digital twin for each employee. Using information from the company’s organizational database, it populates each twin with the employee’s ID, team ID, department type, and location. The twin also keeps a list of the employee’s contacts within the organization (as well as community contacts, discussed below). This allows immediate colleagues and their contacts to be notified if an employee tests positive. The following diagram illustrates an employee’s real-time digital twin and the state data it holds; details about the contact tracing code are explained below:

The twin automatically populates its contact list with the other members of the employee’s team, based on the expectation that team members are in daily contact. Using the mobile app, employees can log one-time and recurring contacts with colleagues in other teams, possibly at different office locations. In addition, they can log contacts outside the company, such as taxi rides, airline flights, and meals at restaurants, so that community members can be notified if an employee was exposed to COVID-19.

An employee can use the mobile app to notify their real-time digital twin of a positive test for COVID-19. Code running in the twin then sends messages to the real-time digital twins for all contacts in the employee’s list. These twins in turn send messages to their contacts, and so on, until the twins for all contacts have been notified. (The algorithm avoids unnecessary messages to team members and circular paths among twins.) The twin then sends a push notification to each affected employee through the mobile app, alerting them to the possible exposure and the number of intermediate contacts between themselves and the infected person. Because real-time digital twins are hosted in memory, all of this happens within seconds, enabling affected employees to immediately self-quarantine and obtain COVID-19 tests.
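The notification chain described above amounts to a breadth-first traversal of the contact graph that skips anyone already notified. The sketch below computes it centrally for clarity, whereas the demo application passes messages between twins. The function name `notify_contacts` and the sample employees are hypothetical.

```python
from collections import deque


def notify_contacts(contacts: dict[str, set[str]], infected: str) -> dict[str, int]:
    """Return {employee: number of intermediate contacts} for everyone reachable.

    A direct contact of the infected employee has 0 intermediate contacts.
    """
    distance = {infected: 0}
    queue = deque([infected])
    while queue:
        current = queue.popleft()
        for contact in contacts.get(current, ()):
            if contact not in distance:  # already notified: avoids circular paths
                distance[contact] = distance[current] + 1
                queue.append(contact)
    return {emp: hops - 1 for emp, hops in distance.items() if emp != infected}


# Made-up contact lists for illustration only.
contact_lists = {
    "alice": {"bob", "carol"},
    "bob": {"alice", "dave"},
    "carol": {"alice"},
    "dave": {"bob", "erin"},
    "erin": {"dave"},
}
alerts = notify_contacts(contact_lists, "alice")
print(alerts)
```

Each employee is visited once, so the number of messages stays proportional to the number of contact links rather than growing with every cycle in the graph.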

Here’s an illustration of the chain of contacts originating with an employee who reports testing positive. (Note that the outbound notifications from the twins to the employees’ mobile devices are not shown here.)

What’s in the Real-Time Digital Twin?

As illustrated in the first diagram, each real-time digital twin hosts two components: state data and a message-processing method. These are defined by the contact tracing application and can be written in C#, Java, or JavaScript. (C# was used for the demo application.) The state data is unique for each employee and contains the employee’s information and contact list, along with useful statistics, such as how often the employee has been alerted about a possible exposure. The message-processing method’s code is shared by all twins. It receives messages from the mobile app or from other twins (each corresponding to a single employee) and uses application-defined code to process these messages.

Messages from the mobile app can request to add or remove a contact from the list. For new contacts, they include parameters such as the employee ID of the contact and whether the contact will be recurring. (Users also can record contacts using calendar events.) Messages from the mobile app can also request the current contact list for display, signal that the employee has tested positive or negative, and request current notifications. Messages from other real-time digital twins signal that the corresponding employees have been exposed and provide additional information, such as the number of intermediate contacts and the location of the initial employee who tested positive.
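The split between per-employee state and a shared message-processing method can be sketched as follows. This is written in Python for brevity and is not the ScaleOut API or the demo’s C# code: the class `EmployeeTwin`, the message fields, and the message type names are all assumptions. Forwarding of exposure messages to further contacts is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class EmployeeTwin:
    # State data: unique to each employee.
    employee_id: str
    team_id: str
    department: str
    location: str
    contacts: set = field(default_factory=set)
    alert_count: int = 0
    tested_positive: bool = False

    def process_message(self, message: dict) -> list:
        """Handle one incoming message and return any outgoing messages."""
        kind = message["type"]
        if kind == "add_contact":
            self.contacts.add(message["contact_id"])
        elif kind == "remove_contact":
            self.contacts.discard(message["contact_id"])
        elif kind == "get_contacts":
            return [{"to": "app", "type": "contact_list",
                     "contacts": sorted(self.contacts)}]
        elif kind == "tested_positive":
            self.tested_positive = True
            # Alert every twin on the contact list.
            return [{"to": c, "type": "exposure", "intermediate_contacts": 0,
                     "origin_location": self.location}
                    for c in sorted(self.contacts)]
        elif kind == "exposure":
            self.alert_count += 1  # statistic kept for later aggregation
        return []


twin = EmployeeTwin("e1", "team7", "retail", "TX")
twin.process_message({"type": "add_contact", "contact_id": "e2"})
outgoing = twin.process_message({"type": "tested_positive"})
print(outgoing)
```

The method body is the same for every twin, while the fields it reads and updates belong to one employee only.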

The application’s message-processing code responds to these messages and implements the spanning-tree notification algorithm that alerts other twins on the contact list. The streaming service handles the rest, namely the details of message delivery, retrieval and updating of state information, and managing the execution platform.

Using the Mobile App

The following animated diagram shows how an employee can add a contact with a company colleague outside of their immediate team or with a community contact during business travel (left screenshot). If the employee tests positive, the employee can use the mobile app to report this to the company (middle screenshot). All employees are then notified using the mobile app, as shown in the right screenshot. Community contacts are reported to managers who communicate with outside points of contact, such as airlines, taxi companies, and restaurants.

Using Aggregate Statistics to Spot Outbreaks

The streaming service has the built-in capability to aggregate state data from all real-time digital twins. The service then displays the results in charts which are recalculated every few seconds. These charts enable managers to identify emerging issues, such as an outbreak within a specific department or site. With this information, they can take immediate steps to contain the outbreak and minimize the number of affected employees.
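Aggregating state data across twins is essentially a group-and-count over one state property. The sketch below assumes each twin’s state is available as a dictionary with `location` and `alert_count` fields. Those field names, the function `notified_by_site`, and the sample records are made up for illustration, and the streaming service performs this step internally.

```python
from collections import Counter


def notified_by_site(twin_states: list) -> Counter:
    """Count employees per site who have received at least one exposure alert."""
    return Counter(
        state["location"] for state in twin_states if state["alert_count"] > 0
    )


# Made-up twin states for illustration only.
states = [
    {"employee_id": "e1", "location": "TX", "alert_count": 1},
    {"employee_id": "e2", "location": "TX", "alert_count": 0},
    {"employee_id": "e3", "location": "AZ", "alert_count": 2},
    {"employee_id": "e4", "location": "CA", "alert_count": 1},
    {"employee_id": "e5", "location": "TX", "alert_count": 3},
]
print(notified_by_site(states))
```

Recomputing this count every few seconds gives the per-site view that a chart would display.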

To illustrate the value of aggregate statistics in boosting situational awareness, consider a hypothetical company with 30,000 employees and offices in several states across the U.S. Suppose an employee at the Texas site suddenly tests positive. Managers could be alerted immediately by the following chart, which is generated and continuously updated by the streaming service and shows all employees who have tested positive:

Within a few seconds, the real-time digital twins notify all points of contact. Updates to state data are immediately aggregated in another chart that shows the sites where employees have been notified of a positive contact and the number of employees affected at each site:

This chart shows that about 140 employees in three states were notified and possibly exposed directly or indirectly. All of these employees are then immediately quarantined to contain the possible spread of COVID-19. After an investigation by company managers, it is determined that the employee had business travel to Arizona and met with a team that subsequently had business travel to California. Instead of taking hours or days to uncover the scope of a COVID-19 exposure, contact tracing using real-time digital twins alerts managers within seconds.

The real-time digital twins can collect additional useful statistics for visualization by the streaming service. Another chart can show the average number of intermediate contacts for all notified employees, which is an indication of how widely employees have been interacting across teams. If this becomes an issue (as it is in the above example), managers can implement policies to further isolate teams. As shown below, a chart can also show the number of notified employees by department so that managers can determine whether certain departments, such as retail outlets, need stricter policies to limit exposure to COVID-19 from outside contacts.

The Benefits of an Integrated Streaming Service

This contact tracing application demonstrates the power of real-time digital twins to enable fast application development with compelling benefits. Because the amount of application code is small, real-time digital twins can be quickly written and tested. (See a recent blog post which describes how to simplify debugging and testing using a mock environment prior to deployment in the cloud.) They also can be easily modified and updated.

The ScaleOut Digital Twin Streaming Service provides the execution platform so that the application code does not have to deal with message distribution, state saving, performance scaling, and high availability. It also includes support for real-time aggregate analytics and visualization integrated with the real-time digital twin model to maximize ease of use.

Compare this approach to the complexity of building out an application server farm, database, analytics application, and visualization to accomplish the same goals at higher cost and lower performance. Cobbling together these diverse technologies would require several skill sets, lengthy development time, and higher operational costs.

Summing Up

This demo contact tracing application was designed to show how companies can take advantage of their organizational structures to track contacts among employees and quickly notify all affected employees when an individual tests positive for COVID-19. By responding quickly to an exposure with immediate, comprehensive information about its extent within the company (and with community contacts), managers can limit the exposure’s impact. The application also shows how the real-time digital twin model enables a quick, agile implementation which can be easily adapted to the specific needs of a wide range of companies.

Please contact us at ScaleOut Software to learn more about this demo application for limiting the impact of COVID-19 and other ways real-time digital twins can help your company monitor and respond to fast-changing events.


The Power of Integrated Analytics Within an IMDG

Tue, 21 Jul 2020 12:00:00 +0000

ScaleOut StateServer® Pro Adds Analytics to In-Memory Data Grids

In-Memory Data Grids for Fast-Changing Data

For more than fifteen years, ScaleOut StateServer® has demonstrated technology leadership as an in-memory data grid (IMDG) and distributed cache. Designed to help scalable applications deliver high performance, it stores live, fast-changing data in memory (DRAM) for fast updates and retrieval. By transparently distributing stored objects across a cluster of servers (physical or virtual), it automatically scales performance for fast-growing workloads and maintains consistently low access latency. Typical uses include storing session-state and ecommerce shopping carts, product descriptions, airline reservations, financial portfolios, news stories, online learning data, and many others.

From its inception, the design philosophy behind ScaleOut StateServer has been to simultaneously maximize both performance and ease of use. Because IMDGs have complex internal mechanisms, they need to automate them as much as possible so that the developer can just focus on application concerns and not on the inner workings of the IMDG. For this reason, the product incorporates features such as automatic discovery of servers, transparent load-balancing when servers are added to the cluster or removed, automatic data replication for high availability with transparent placement of replicas, quorum-based updating of replicas to ensure consistency, integrated client libraries, and coherent client-side caching. The net effect is that applications maintain a straightforward view of the IMDG as a unified key/value store for serialized application objects.

The Challenges with Parallel Queries

Although IMDGs are optimized for key-based access, applications often need to retrieve groups of objects with matching properties. For example, if an application is storing shopping carts, it might be useful to find all shopping carts with a total value that exceeds a specified threshold so that these shoppers can be given special attention. To this end, ScaleOut StateServer incorporates a property-based, distributed query API that returns a collection of matching objects. To simplify development for .NET applications, it uses Microsoft’s Language-Integrated Query (LINQ) to specify queries. (Java applications use a similar mechanism.)

Queries in an IMDG can create interesting performance challenges. Because IMDGs have highly scalable storage capacity, they can easily return large numbers of matching objects to the client application, and this leads to network bottlenecks transferring large amounts of data from the IMDG back to the client. Once all objects are delivered, the client is then faced with the task of analyzing potentially huge numbers of objects. This can saturate the client’s CPU and delay responses, as illustrated in the following diagram:

ScaleOut StateServer Pro: Integrated Data Analytics

To address these challenges, ScaleOut Software has introduced ScaleOut StateServer Pro, an advanced version of ScaleOut StateServer that integrates data analytics within the IMDG. Instead of querying objects from the IMDG and analyzing them in the client, applications can now simply run this analysis within the IMDG itself using APIs available in ScaleOut StateServer Pro. Because all the work is performed within the IMDG, this has the two-fold advantage of offloading both the network and the client’s CPU. It also transparently makes use of the IMDG’s scalable computing resources to accelerate the analysis.

Take a look at how integrated data analytics can help client applications. In the following illustration, the client library sends the application’s analysis method (“Analyze”) to the IMDG for execution in parallel on all shopping carts selected by a query. The results are combined within the IMDG and returned to the application:

Keeping with the design philosophy of maximizing both performance and ease of use, ScaleOut StateServer Pro lets developers easily construct data analytics by specifying an object-oriented method that analyzes each matching object selected by a query and a second method for combining the results. In .NET applications, this data-parallel execution structure can be described using a distributed version of Microsoft’s popular Parallel.ForEach API, which ScaleOut StateServer Pro integrates with LINQ query. Application code is automatically shipped by the client library to the IMDG for execution and runs fully in parallel across all servers for maximum performance.

Consider the above example of querying shopping carts exceeding a threshold value. Suppose the application’s goal is to periodically analyze high value shopping carts to make upsell offers based on the contents of each cart. Instead of querying the IMDG and returning thousands of shopping carts to the client, the application can implement a method which analyzes these carts within the IMDG to determine which carts should receive upsell offers (and possibly determine which upsell offers to make). This analysis runs in parallel within the IMDG and then returns its results to the client for further action. This dramatically reduces the workload on the network and client, and it ensures consistently high performance.
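The analyze-then-combine structure described above can be sketched on a single machine with the standard .NET `Parallel.ForEach` overload that keeps per-thread partial results. This is only an analogy under stated assumptions: the `Cart` type, the `UpsellAnalysis` class, and the "no warranty in the cart" rule are hypothetical, and the code runs across local threads rather than across the servers of an IMDG.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical application type, used only for illustration.
public class Cart
{
    public string CustomerId { get; set; }
    public decimal TotalValue { get; set; }
    public List<string> Items { get; set; } = new List<string>();
}

public static class UpsellAnalysis
{
    // "Analyze" step: decide whether one cart should receive an offer.
    // Placeholder rule: offer a warranty if the cart does not contain one.
    private static bool Analyze(Cart cart) => !cart.Items.Contains("warranty");

    public static List<string> FindUpsellCandidates(
        IEnumerable<Cart> carts, decimal threshold)
    {
        var merged = new List<string>();
        var mergeLock = new object();

        Parallel.ForEach<Cart, List<string>>(
            carts.Where(c => c.TotalValue > threshold),   // the query
            () => new List<string>(),                     // per-thread partial result
            (cart, loopState, local) =>
            {
                if (Analyze(cart)) local.Add(cart.CustomerId);
                return local;
            },
            local =>                                      // the combine step
            {
                lock (mergeLock) { merged.AddRange(local); }
            });

        merged.Sort(StringComparer.Ordinal);
        return merged;
    }
}
```

Only the customer IDs of the selected carts come back to the caller, not the carts themselves, which is the point of moving the analysis to the data.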

Summing Up: Extracting Maximum Value from an IMDG

Since their inception, IMDGs have been largely underutilized because they are viewed as passive key/value stores. Because they are actually designed as data-parallel execution platforms, they can do much more than just store and serve memory-hosted, live data. They can also perform analysis quickly and efficiently where the data lives.

Taking full advantage of this powerful capability requires just a shift in thinking about where application work should be performed. In many cases, it’s a much better choice to analyze data within the IMDG instead of transferring it to the client for analysis. ScaleOut StateServer Pro makes it easy to do just that, and it delivers fast, scalable performance. Now developers can finally extract full value from their IMDGs.


The Amazing Evolution of In-Memory Computing

Mon, 22 Jun 2020 23:31:30 +0000

From Distributed Caches to Real-Time Digital Twins

Going back to the mid-1990s, online systems have seen relentless, explosive growth in usage, driven by ecommerce, mobile applications, and more recently, IoT. The pace of these changes has made it challenging for server-based infrastructures to manage fast-growing populations of users and data sources while maintaining fast response times. For more than two decades, the answer to this challenge has proven to be a technology called in-memory computing.

In general terms, in-memory computing refers to the related concepts of (a) storing fast-changing data in primary memory instead of in secondary storage and (b) employing scalable computing techniques to distribute a workload across a cluster of servers. Assuming bottlenecks are avoided, this enables transparent throughput scaling that matches an increase in workload, which in turn keeps response times low for users. It can also take advantage of the elastic computing resources available in cloud infrastructures to quickly and cost-effectively scale throughput to meet changes in demand.

Harnessing the power of in-memory computing requires software platforms that can make in-memory computing’s scalability readily available to applications using APIs while hiding the complexity of its implementation. Emerging in the early 2000s, the first such platforms provided distributed caching on clustered servers with straightforward APIs for storing and retrieving in-memory objects. When first introduced, distributed caching offered a breakthrough for applications by storing fast-changing data in memory on a server cluster for consistently fast response times, while simultaneously offloading database servers that would otherwise become bottlenecked. For example, ecommerce applications adopted distributed caching to store session-state, shopping carts, product descriptions, and other data that shoppers need to be able to access quickly.

Software platforms for distributed caching, such as ScaleOut StateServer®, which was introduced in 2005, hide internal mechanisms for cluster membership, throughput scaling, and high availability to take full advantage of the cluster’s scalable memory without adding complexity to applications. They transparently distribute stored objects across the cluster’s servers and ensure that data is not lost if a server or network component fails.

As distributed caching has evolved over the last two decades, additional mechanisms for in-memory computing have been incorporated to take advantage of the computing power available in the server cluster. Parallel query enables stored objects on all servers to be scanned simultaneously to retrieve objects with desired properties. Data-parallel computing analyzes objects on the server cluster to extract and report patterns of interest; it scales much better than parallel query by avoiding network bottlenecks and by using the cluster’s computing resources.

Most recently, stream-processing has been implemented with in-memory computing to simultaneously analyze telemetry from thousands or even millions of data sources and track dynamic state information for each data source. ScaleOut Software’s real-time digital twin model provides straightforward APIs for implementing stream-processing applications within its ScaleOut Digital Twin Streaming Service, an Azure-based cloud service, while hiding internal mechanisms, such as distributing incoming messages to in-memory objects, updating state information for each data source, and running aggregate analytics.

The following diagram shows the evolution of in-memory computing from distributed caching to stream-processing with real-time digital twins. Each step in the evolution has built on the previous one to add new capabilities that take advantage of the scalable computing power and fast data access that in-memory computing enables.

For ecommerce applications, this evolution has created new capabilities that dramatically improve the experience for online shoppers. Instead of just passively hosting session-state and shopping carts, online applications now can mine shopping carts for dynamic trends to evaluate the effectiveness of product descriptions and marketing programs (such as flash sales). They can also employ real-time digital twins or similar techniques to track each shopper’s behavior and make recommendations. By analyzing a click-stream of product selections in the context of knowledge of a shopper’s preferences and demographics, an ecommerce site can make highly focused recommendations to assist the shopper.

For example, one of ScaleOut Software’s customers recently upgraded from just using distributed caching to scale its ecommerce application. This customer now incorporates stream-processing capabilities using ScaleOut StreamServer® to capture click-streams and score users so that its web site can make more effective real-time offers.

The following diagram illustrates how the evolution of in-memory computing has enhanced the online experience for ecommerce shoppers by providing in-the-moment suggestions:

Starting with its development for parallel supercomputing in the late 1990s and evolving into its latest form as a cloud-based service, in-memory computing has offered powerful, software-based APIs for building applications that serve large populations of users and data sources. It has helped assure that these applications deliver predictably fast performance and scale to meet the demands of growing workloads. In the next few years, we should see continued innovation from in-memory computing to help ecommerce and other applications maintain their competitive edge.


Announcing the ScaleOut Digital Twin Streaming Service™

Tue, 19 May 2020 12:00:19 +0000

Today ScaleOut Software announces the release of its ground-breaking cloud service for streaming analytics using the real-time digital twin model. It’s called the ScaleOut Digital Twin Streaming Service, and it’s now available for production use. Sign up to use the service here.

A major challenge for stream-processing applications that track numerous data sources in real time is to analyze telemetry relevant to each specific data source and combine this with dynamic, contextual information about the data source to enable immediate action when necessary. For example, heart-rate telemetry from a smart watch cannot be effectively evaluated in isolation. Instead, it needs to be combined with knowledge of each person’s age, health, medications, and activity to determine when an alert should be generated.

A second and equally daunting challenge for live systems is to maintain real-time situational awareness about the state of all data sources so that strategic responses can be implemented, especially when a rapid sequence of events is unfolding. Whether it’s a rental car fleet with 100K vehicles on the road or a power grid with 40K nodes subject to outages, system managers need to quickly identify the scope of emerging problems and rapidly focus resources where most needed.

Traditional platforms for streaming analytics attempt to look at the entire telemetry pipeline using techniques such as SQL query to uncover and act on patterns of interest. But this approach is complex and leads to superficial analysis in real time, forcing telemetry to be logged into a data lake for later analysis using Spark or other tools. How do you trigger an alert to the wearer of a smart watch at the exact moment that the combination of telemetry fluctuations and knowledge about the individual’s health indicates that an alert is needed?

The key to creating straightforward stream-processing applications that can deal with these challenges lies in a software concept called the “real-time digital twin model.” Borrowed from its use in the field of product life-cycle management, real-time digital twins host application code that analyzes incoming telemetry (event messages) from each individual data source and maintains dynamically evolving information about the data source. This approach refactors and simplifies application code (which can be written in standard Java, C#, or JavaScript) to just focus on a single data source, introspect deeply, and better predict important events.
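To make the idea concrete, here is a minimal sketch of per-source state combined with per-message analysis, using the smart watch example. It is plain C# and does not use the streaming service's API. The `HeartRateTwin` class, its properties, and the alert rule are assumptions made for illustration (220 minus age is only a common rule-of-thumb estimate of maximum heart rate, not medical guidance).

```csharp
using System;

// Hypothetical twin state for one smart watch wearer.
public class HeartRateTwin
{
    // Contextual information about this data source.
    public int Age { get; set; }
    public bool IsExercising { get; set; }

    // Dynamic state that evolves as telemetry arrives.
    public int ConsecutiveHighReadings { get; private set; }

    // Processes one heart-rate reading and returns true when an alert
    // should be raised. The thresholds are illustrative assumptions.
    public bool ProcessReading(int beatsPerMinute)
    {
        int estimatedMax = 220 - Age;
        double limit = IsExercising ? 0.9 * estimatedMax : 0.6 * estimatedMax;

        ConsecutiveHighReadings =
            beatsPerMinute > limit ? ConsecutiveHighReadings + 1 : 0;

        // Alert only after three high readings in a row.
        return ConsecutiveHighReadings >= 3;
    }
}
```

The same reading of 120 beats per minute produces an alert for a 60-year-old at rest after three consecutive messages, but not while that person is exercising. The code only ever reasons about one data source.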

The following diagram illustrates how the ScaleOut Digital Twin Streaming Service hosts real-time digital twins that receive telemetry from individual data sources and send responses, including commands and alerts:

Because real-time digital twins maintain and dynamically update key information about each data source, aggregate analytics — essentially, continuous queries — can continuously look for patterns in this curated data instead of in just the raw telemetry. This enables immediate, focused insights that enhance situational awareness. For example, the streaming service can generate a bar chart every few seconds to aggregate and highlight alerts by region generated by examining properties of real-time digital twins for thousands of data sources:

The ScaleOut Digital Twin Streaming Service plugs into popular event hubs, such as Azure IoT Hub, AWS IoT Core, and Kafka, to extract event messages and forward them to real-time digital twin instances, one for each data source. It then triggers application code to process the messages and gives it immediate access to memory-based contextual information for the data source. Application code can generate alerts, command devices, update the contextual information, and read or update databases as needed. This code can be thought of as similar to a serverless function with the major distinction that it is automatically supplied contextual information and does not have to maintain it in an external data store.
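The dispatch pattern described above, in which each message is routed to the instance for its data source and the handler is handed that instance's state, can be sketched in a few lines. The `TwinHost` class and its members are hypothetical and hold everything in one process; they are not the service's API and say nothing about how it connects to event hubs.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical in-process host: one state object per data source.
public class TwinHost<TState> where TState : new()
{
    private readonly Dictionary<string, TState> _twins =
        new Dictionary<string, TState>();
    private readonly Action<TState, string> _handler;

    public TwinHost(Action<TState, string> handler)
    {
        _handler = handler;
    }

    // Routes a message to the twin for its data source, creating the
    // twin on first use, and supplies the twin's state to the handler.
    public void Dispatch(string sourceId, string message)
    {
        if (!_twins.TryGetValue(sourceId, out var state))
        {
            state = new TState();
            _twins[sourceId] = state;
        }
        _handler(state, message);
    }

    public TState GetState(string sourceId) => _twins[sourceId];
}

// Example state: counts the messages received from one source.
public class MessageCounter
{
    public int Count;
}
```

A caller would construct `new TwinHost<MessageCounter>((state, msg) => state.Count++)` and call `Dispatch` for each incoming message. The handler never looks up or saves its own context.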

This highly scalable cloud service is designed to simultaneously and cost-effectively track telemetry from millions of data sources and provide real-time feedback in milliseconds while simultaneously performing continuous, aggregate analytics every few seconds. A powerful UI enables fast deployment of real-time digital twin models created using the ScaleOut Digital Twin Builder software toolkit. The UI lets users build graphical widgets which create and chart aggregate statistics. Under the hood, a powerful in-memory data grid with an integrated compute engine transparently ensures fast, predictable performance.

Given the current COVID-19 crisis, here’s a use case in which the streaming service can assist in prioritizing the distribution of critical medical supplies to the nation’s hospitals. Hospitals distributed across the United States can send status updates to the cloud service regarding their shortfall of supplies such as ventilators and personal protective equipment. Within milliseconds, a dedicated real-time digital twin instance for each hospital can analyze incoming messages to track and evaluate the need for supplies, determine the hospital’s overall shortfall, and assess the urgency for immediate action, as depicted below:

The streaming service can then simultaneously analyze these results across the population of digital twin instances to determine in seconds which regions are exhibiting the most critical shortfall. This alerts logistics managers, who can then query the digital twins to identify specific hospitals and implement a strategic response:
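The two levels of analysis in this use case, per-hospital tracking and cross-hospital aggregation, might look roughly like the following. All names here (`HospitalTwin`, `ProcessStatus`, `ShortfallAnalytics`) are hypothetical, only ventilators are tracked, and the aggregation is ordinary LINQ over a local collection rather than the service's aggregate analytics.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical twin state for one hospital.
public class HospitalTwin
{
    public string Region { get; set; }
    public int VentilatorsNeeded { get; private set; }
    public int VentilatorsOnHand { get; private set; }

    // Handles one status update from the hospital.
    public void ProcessStatus(int needed, int onHand)
    {
        VentilatorsNeeded = needed;
        VentilatorsOnHand = onHand;
    }

    // A surplus counts as zero shortfall.
    public int Shortfall => Math.Max(0, VentilatorsNeeded - VentilatorsOnHand);
}

public static class ShortfallAnalytics
{
    // Sums the shortfall per region and lists the worst regions first.
    public static List<KeyValuePair<string, int>> RankRegions(
        IEnumerable<HospitalTwin> twins)
    {
        return twins
            .GroupBy(t => t.Region)
            .Select(g => new KeyValuePair<string, int>(
                g.Key, g.Sum(t => t.Shortfall)))
            .OrderByDescending(p => p.Value)
            .ToList();
    }
}
```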

The real-time digital twin approach creates a breakthrough for application developers that both simplifies application development and enhances introspection. It’s ideal for a wide range of applications, including real-time intelligent monitoring (the example above), Industrial Internet of Things (IIoT), logistics, security and disaster recovery, e-commerce recommendations, financial services, and much more. The ScaleOut Digital Twin Streaming Service is available now. We invite interested users to contact us here to learn more.


Real-Time Digital Twins Simplify Development

Wed, 06 May 2020 19:11:13 +0000

The Challenge: Track Thousands of Data Sources

Writing applications for streaming analytics can be complex and time consuming. Developers need to write code that extracts patterns out of an incoming telemetry stream and takes appropriate action. In most cases, applications just export interesting patterns for visualization or offline analysis. Existing programming models make it too difficult to perform meaningful analysis in real time.

This obstacle clearly presents itself when tracking the behavior of large numbers of data sources (thousands or more) and attempting to generate individualized feedback within several milliseconds after receiving a telemetry message. There’s no easy way to separately examine incoming telemetry from each data source, analyze it in the context of dynamically evolving information about the data source, and then generate an appropriate response.

An Example: Contact Self-Tracing

For example, consider the contact self-tracing application described in a previous blog. This application tracks messages from thousands of users logging risky contacts who might transmit the COVID-19 virus to them. For each user, a list of contacts must be maintained. If a user signals the app that he or she has tested positive, the app successively notifies chained contacts until all connected users have been notified.

Implementing this application requires that state information be maintained for each user (such as the contact list) and updated whenever a message from that user arrives. In addition, when an incoming message signals that a user has tested positive, contact lists for all connected users must be quickly accessed to generate outgoing notifications.

In addition to this basic functionality, it would be highly valuable to compute real-time statistics, such as the number of users reporting positive by region, the average length of connection chains for all contacts, and the percentage of connected users who also report testing positive.

A Conventional Implementation

This application cannot be implemented by most streaming analytics platforms. As illustrated below, it requires standing up a set of cooperating services (encompassing a variety of skills), including:

A web service to process incoming messages (including notifications) by making calls to a backend database
A database service to host state information for each user
A backend analytics application (for example, a Spark app) that extracts information from the database, analyzes it, and exports it for visualization
A visualization tool that displays key statistics

Implementing and integrating these services requires significant work. After that, issues of scaling and high availability must be addressed. For example, how do we keep the database service from becoming a bottleneck when the number of users and update rate grow large? Can the web front end process notifications (which involve recursive chaining) without impacting responsiveness? If not, do we need to offload this work to an application server farm? What happens if a web or application server fails while processing notifications?

Enter the Real-Time Digital Twin

The real-time digital twin (RTDT) model was designed to address these challenges and simplify application development and deployment while also tackling scaling, high availability, real-time analytics, and visualization. This model uses standard object-oriented techniques to let developers easily specify both the state information to be maintained for each data source and the code required to process incoming messages from that data source. The rest is automatically handled by the hosting platform, which makes use of scalable, in-memory computing techniques that ensure high performance and availability.

Another big plus of this approach is that the state information for each instance of an RTDT can be immediately analyzed to provide important statistics in real time. Because this live state information is held in memory distributed across a cluster of servers, the in-memory computing platform can analyze it and report results every few seconds. There’s no need to suffer the delay and complexity of copying it out to a data lake for offline analysis using an analytics platform like Spark.

The following diagram illustrates the steps needed to develop and deploy an RTDT model using the ScaleOut Digital Twin Streaming Service, which includes built-in connectors for exchanging messages with data sources:

Implementing Contact Self-Tracing Using a Real-Time Digital Twin Model

Let’s take a look at just how easy it is to build an RTDT model for tracking users in the contact self-tracing application. Here’s an example in C# of the state information that needs to be tracked for each user:

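using System.Collections.Generic;

public class UserTwin : DigitalTwinBase
{
    public string Alias;
    public string Email;
    public string MobileNumber;
    public Status CurrentStatus;  // Normal, SignaledPositive, or Notified
    public int NumHopsIfNotified;

    // Immediate contacts reported by this user.
    public List<Contact> Contacts = new List<Contact>();

    // Notifications already received, keyed by the root contact's alias.
    public Dictionary<string, Contact> Notifiers = new Dictionary<string, Contact>();
}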

This simple set of properties is sufficient to track each user. In addition to phone and/or email, each user’s current status (i.e., normal, reported testing positive, or notified of a contact who has tested positive) is maintained. If the user is notified by another contact, the number of hops to the connected user who reported testing positive is also recorded to help create statistics. There’s also a list of immediate contacts, which is updated whenever the user reports a risky interaction. Lastly, there’s a dictionary of incoming notifications to help prevent sending out duplicates.
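The state class refers to a few supporting types that are not shown in this post. One plausible set of definitions, inferred only from how the message-handling code in this post uses them, is sketched below. These are assumptions rather than the original declarations; in particular, any base class that the platform requires for messages is omitted here.

```csharp
using System;

// Inferred from the status values used by the message-handling code.
public enum Status { Normal, SignaledPositive, Notified }

// Inferred from the three message cases handled in the switch statement.
public enum MsgType { AddContact, SignalPositive, Notify }

// One entry in a user's contact list or notifier dictionary.
public class Contact
{
    public string Alias;
    public DateTime ContactTime;
    public int NumHops;
}

// A message exchanged between a user's device and twin instances.
public class UserTwinMessage
{
    public string Id;             // alias of the sending user
    public MsgType MsgType;
    public string ContactAlias;   // contact being added, or root of a notification
    public DateTime ContactTime;
    public int NumHops;
}
```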

The RTDT stream-processing platform automatically creates an instance of this state object for every user when an account is created. The user can then send messages to the platform to either record an interaction or report having tested positive. Here’s the application code required to process these messages:

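// dt is the UserTwin instance that received the messages; newContact, notifyMsg,
// and msgResult are assumed to be locals declared earlier in the enclosing method.
foreach (var msg in newMessages)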
{
    switch (msg.MsgType)
    {
        case MsgType.AddContact:
            newContact = new Contact();
            newContact.Alias = msg.ContactAlias;
            newContact.ContactTime = DateTime.UtcNow;
            dt.Contacts.Add(newContact);
            break;

        case MsgType.SignalPositive:
            dt.CurrentStatus = Status.SignaledPositive;

            // signal all of our contacts that we have tested positive:
            notifyMsg = new UserTwinMessage();
            notifyMsg.Id = dt.Alias;
            notifyMsg.MsgType = MsgType.Notify;
            notifyMsg.ContactAlias = dt.Alias;
            notifyMsg.ContactTime = DateTime.UtcNow;
            notifyMsg.NumHops = 0;

            foreach (var contact in dt.Contacts)
            {
                msgResult = context.SendToTwin("UserTwin", contact.Alias,
                                               notifyMsg);
            }
            break;
    }
}

Note that when a user signals that he or she has tested positive, this code sends a Notify message to all RTDT instances corresponding to users in the contact list. Handling this message type requires adding one more case to the switch statement above, as follows:

case MsgType.Notify:
    if (dt.CurrentStatus != Status.SignaledPositive)
        dt.CurrentStatus = Status.Notified;

    // if we have already heard from the root contact, ignore the message:
    if (msg.ContactAlias == dt.Alias || dt.Notifiers.ContainsKey(msg.ContactAlias))
        break;

    // otherwise, add the notifier and signal the user:
    else
    {
        if (dt.NumHopsIfNotified == 0)
            dt.NumHopsIfNotified = msg.NumHops;

        newContact = new Contact();
        newContact.Alias = msg.ContactAlias;
        newContact.ContactTime = msg.ContactTime;
        newContact.NumHops = msg.NumHops + 1;
        dt.Notifiers.Add(msg.ContactAlias, newContact);

        notifyMsg = new UserTwinMessage();
        notifyMsg.Id = dt.Alias;
        notifyMsg.MsgType = MsgType.Notify;
        notifyMsg.ContactAlias = msg.ContactAlias;
        notifyMsg.ContactTime = msg.ContactTime;
        notifyMsg.NumHops = msg.NumHops + 1;
    
        msgResult = context.SendToDataSource(notifyMsg);

        // finally, notify all our contacts except the root if it's a contact:

        foreach (var contact in dt.Contacts)
        {
            if (contact.Alias != msg.ContactAlias)
                msgResult = context.SendToTwin("UserTwin", contact.Alias,
                                               notifyMsg);
        }
    }
    break;

That’s all there is to it. The key point is that the amount of code required to implement this application is small. Compare that to standing up web, application, database, and analytics services. In addition, integrated, real-time analytics within the RTDT platform can examine state variables to easily generate and visualize key statistics. For example, the CurrentStatus property and a Region property (not shown here) can be used to determine the number of users who have tested positive by region. Likewise, the NumHopsIfNotified property can be used to determine the average number of connections traversed to notify users.
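As a sketch of what these two statistics compute, the following code applies ordinary LINQ to a local collection of user state. The `UserState` class is a trimmed, hypothetical stand-in for the twin state (including the Region property that the post mentions but does not show), and the platform's own aggregate analytics are not used here.

```csharp
using System.Collections.Generic;
using System.Linq;

public enum UserStatus { Normal, SignaledPositive, Notified }

// Hypothetical, trimmed-down view of one user's twin state.
public class UserState
{
    public string Region;
    public UserStatus CurrentStatus;
    public int NumHopsIfNotified;
}

public static class ContactStats
{
    // Number of users who have signaled a positive test, per region.
    public static Dictionary<string, int> PositivesByRegion(
        IEnumerable<UserState> users)
    {
        return users
            .Where(u => u.CurrentStatus == UserStatus.SignaledPositive)
            .GroupBy(u => u.Region)
            .ToDictionary(g => g.Key, g => g.Count());
    }

    // Average hop count recorded by notified users (0 if there are none).
    public static double AverageHops(IEnumerable<UserState> users)
    {
        var notified = users
            .Where(u => u.CurrentStatus == UserStatus.Notified)
            .ToList();
        return notified.Count == 0
            ? 0.0
            : notified.Average(u => u.NumHopsIfNotified);
    }
}
```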

Summing Up

There’s no doubt that it’s a daunting challenge to create streaming analytics applications that track large numbers of data sources and respond individually to each one while simultaneously generating aggregate statistics that help maintain situational awareness. As we have seen, real-time digital twins can cut through this complexity and enable powerful applications to be quickly built with minimal code. This simplicity also makes them “agile” in the sense that they can be easily modified or extended to handle evolving requirements. You can find detailed information here to help you learn more.


Use Distributed Caching to Accelerate Online Web Sites

Wed, 22 Apr 2020 19:16:54 +0000

The Challenge: Keeping Online Sites Fast

In this time of extremely high online usage, web sites and services have quickly become overloaded, clogged trying to manage high volumes of fast-changing data. Most sites maintain a wide variety of this data, including information about logged-in users, e-commerce shopping carts, requested product specifications, or records of partially completed transactions. Maintaining rapidly changing data in back-end databases creates bottlenecks that impact responsiveness. In addition, repeatedly accessing back-end databases to serve up popular items, such as product descriptions and news stories, also adds to the bottleneck.

The Solution: Distributed Caching

The solution to this challenge is to use scalable, memory-based data storage for fast-changing data so that web sites can keep up with exploding workloads. A widely used technology called distributed caching meets this need by storing frequently accessed data in memory on a server farm instead of within a database. This speeds up accesses and updates while offloading back-end database servers. Also called in-memory data grids, distributed caches, such as ScaleOut StateServer®, use server farms to both scale storage capacity and accelerate access throughput, thereby maintaining fast data access at all times.

The following diagram illustrates how a distributed cache can store fast-changing data to accelerate online performance and offload a back-end database server:

The Technology Behind Distributed Caching

It’s not enough simply to lash together a set of servers hosting a collection of in-memory caches. To be reliable and easy to use, distributed caches need to incorporate technology that provides important attributes, including ease of integration, location transparency, transparent scaling, and high availability with strong consistency. Let’s take a look at some of these capabilities.

To make distributed caches easy to use and keep them fast, they typically employ a “NoSQL” key/value access model and store values as serialized objects. This enables web applications to store, retrieve, and update instances of their application-defined objects (for example, shopping carts) using a simple key, such as a user’s unique identifier. This object-oriented approach allows distributed caches to be viewed as more of an extension of an application’s in-memory data storage than as a separate storage tier.
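The key/value model with serialized objects can be illustrated with a tiny single-process stand-in. The `SimpleObjectCache` and `CartRecord` names are hypothetical, JSON from `System.Text.Json` is used only because it ships with modern .NET (a real cache may use a different serializer), and nothing here is distributed.

```csharp
using System.Collections.Concurrent;
using System.Text.Json;

// Hypothetical application object stored in the cache.
public class CartRecord
{
    public string UserId { get; set; }
    public int ItemCount { get; set; }
}

// Single-process stand-in for a key/value cache of serialized objects.
public class SimpleObjectCache
{
    private readonly ConcurrentDictionary<string, byte[]> _store =
        new ConcurrentDictionary<string, byte[]>();

    // Serializes the object and stores the bytes under the key.
    public void Put<T>(string key, T value)
    {
        _store[key] = JsonSerializer.SerializeToUtf8Bytes(value);
    }

    // Looks up the key and deserializes a fresh copy of the object.
    public bool TryGet<T>(string key, out T value)
    {
        if (_store.TryGetValue(key, out var bytes))
        {
            value = JsonSerializer.Deserialize<T>(bytes);
            return true;
        }
        value = default;
        return false;
    }
}
```

Because values are stored as bytes, each read returns a new copy; changing a retrieved object does not change the cached one until it is stored again.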

That said, a web application needs to interact with a distributed cache as a unified whole. It’s just too difficult for the application to keep track of which server within a distributed cache holds a given data object. For this reason, distributed caches handle all the bookkeeping required to keep track of where objects are stored. Applications simply present a key to the distributed cache, and the cache’s client library finds the object, regardless of which server holds it.

It’s also the distributed cache’s responsibility to distribute access requests across its farm of servers and scale throughput as servers are added. Linear scaling keeps access times low as the workload increases. Distributed caches typically use partitioning techniques to accomplish this. ScaleOut StateServer further integrates the cache’s partitioning with its client libraries so that scaling is both transparent to applications and automatic. When a server is added, the cache quietly rebalances the workload across all caching servers and makes the client libraries aware of the changes.
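One simple way to picture partitioning is to hash each key and use the hash to choose a server. The sketch below is a generic illustration with hypothetical names (`Partitioning`, `StableHash`, `ServerFor`); it is not how ScaleOut StateServer assigns objects to servers. It uses the FNV-1a hash because .NET's built-in string hash codes can differ between processes, so every client must compute the same value for the same key.

```csharp
using System.Collections.Generic;
using System.Text;

public static class Partitioning
{
    // 32-bit FNV-1a hash: deterministic across processes and machines.
    public static uint StableHash(string key)
    {
        unchecked
        {
            uint hash = 2166136261;
            foreach (byte b in Encoding.UTF8.GetBytes(key))
            {
                hash ^= b;
                hash *= 16777619;
            }
            return hash;
        }
    }

    // Maps a key to one of the servers in the farm.
    public static string ServerFor(string key, IReadOnlyList<string> servers)
    {
        int index = (int)(StableHash(key) % (uint)servers.Count);
        return servers[index];
    }
}
```

A plain modulo like this moves most keys when the number of servers changes, which is why production caches typically rely on more refined partitioning schemes to keep rebalancing cheap.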

To enable their use in mission-critical applications, distributed caches need to be highly available, that is, to ensure that both stored data and access to the distributed cache can survive the failure of one of the servers. To accomplish this, distributed caches typically store each object on two (or more) servers. If a server fails, the cache detects this, removes the server from the farm, and then restores the redundancy of data storage in case another failure occurs.

When there are multiple copies of an object stored on different servers, it’s important to keep them consistent. Otherwise, stale data due to a missed update could inadvertently be returned to an application after a server fails. Unlike some distributed caches which use a simpler, “eventual” consistency model prone to this problem, ScaleOut StateServer uses a patented, quorum-based technique which ensures that all stored data is fully consistent.
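To show why a quorum helps avoid stale reads, here is a textbook majority-quorum sketch. It is emphatically not ScaleOut StateServer's patented technique, whose details are not described in this post. The `Replica` and `QuorumRegister` classes are hypothetical, run in one process, and skip real concerns such as rolling back a write that fails to reach a majority.

```csharp
using System.Collections.Generic;
using System.Linq;

// One copy of a stored value, tagged with a version number.
public class Replica
{
    public bool IsUp { get; set; } = true;
    public long Version { get; private set; }
    public string Value { get; private set; }

    // Accepts the write if this replica is reachable.
    public bool TryWrite(long version, string value)
    {
        if (!IsUp) return false;
        if (version > Version) { Version = version; Value = value; }
        return true;
    }
}

public class QuorumRegister
{
    private readonly List<Replica> _replicas;
    private long _nextVersion = 1;

    public QuorumRegister(List<Replica> replicas) { _replicas = replicas; }

    private int Majority => _replicas.Count / 2 + 1;

    // A write succeeds only if a majority of replicas acknowledge it.
    public bool Write(string value)
    {
        long version = _nextVersion++;
        int acks = _replicas.Count(r => r.TryWrite(version, value));
        return acks >= Majority;
    }

    // A read consults a majority and returns the newest version it sees.
    public bool TryRead(out string value)
    {
        value = null;
        var live = _replicas.Where(r => r.IsUp).ToList();
        if (live.Count < Majority) return false;
        value = live.OrderByDescending(r => r.Version).First().Value;
        return true;
    }
}
```

Because any two majorities overlap in at least one replica, a read that reaches a majority always sees the latest successful write, even if one replica missed that update while it was down.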

There’s More: Parallel Query and Computing

Because a distributed cache stores memory-based objects on a farm of servers, it can harness the CPU power of the server farm to analyze stored data much faster than would be possible on a single server. For example, instead of just accessing individual objects using keys, it can query the servers in parallel to find all objects with specified properties. With ScaleOut StateServer, applications can use standard query mechanisms, such as Microsoft LINQ, to create parallel queries executed by the distributed cache.

Although they are powerful, parallel queries can overload both a requesting client and the network by returning a large number of query results. In many cases, it makes more sense to let the distributed cache perform the client’s work within the cache itself. ScaleOut StateServer provides an API called Parallel Method Invocation (and also a variant of .NET’s Parallel.ForEach called Distributed ForEach) which lets a client application ship code to the cache that processes the results of a parallel query and then returns merged results back to the client. Moving code to where the data lives accelerates processing while minimizing network usage.

Distributed Caches Can Help Now

Online web sites and services are now more vital than ever to keeping our daily activities moving forward. Since almost all large web sites use server farms to handle growing workloads, it’s no surprise that server farms can also offer a powerful and cost-effective hardware platform for hosting a site’s fast-changing data. Distributed caches harness the power of server farms to handle this important task and remove database bottlenecks. Also, with integrated parallel query and computing, distributed caches can now do much more to offload a site’s workload. This might be a good time to take a fresh look at the power of distributed caching.



Use Parallel Analysis – Not Parallel Query – for Fast Data Access and Scalable Computing Power

Sat, 28 Jul 2018 01:02:33 +0000

For more than a decade, in-memory data grids (IMDGs) have proven their usefulness for storing fast-changing data in enterprise applications. Whether it’s ecommerce shopping carts, financial trading data, IoT telemetry, or airline reservations, these data sets need fast, reliable access for large, mission-critical workloads. Hosted on commodity clusters or cloud infrastructures, IMDGs harness the power of distributed computing to deliver scalable storage capacity and access throughput, along with integrated high availability. Looking beyond distributed caching, it’s their ability to perform data-parallel analysis that gives IMDGs such exciting capabilities.

To help ensure fast data access and scalability, IMDGs usually employ a straightforward key/value storage model. This model works well for storing large, object-oriented collections of business-logic state, such as the examples listed above. Each object is stored in the grid as a serialized version of the application’s in-memory counterpart and is accessed with a unique key defined within a namespace (as shown in the diagram below). (Namespaces typically hold objects of a single, language-defined type.) In contrast to relational, graph-oriented, and other more complex storage models, key/value stores usually deliver faster data access because key lookups can be quickly completed with low overhead.

Application developers often deploy IMDGs as a distributed cache that sits between an application and its database; the IMDG offloads ephemeral data from the database. For example, it can be used to host short-lived business logic state used to prepare transactions. Offloading the database boosts performance, reduces bottlenecks, and lowers costs.

When used as a cache, developers often view an IMDG with a database mindset and access grid data using traditional, database-oriented techniques, such as SQL query, instead of key-based lookup. For example, an object describing a customer might be retrieved by querying the customer’s last name instead of performing a key-based lookup of the customer’s unique account number. That’s where performance problems can begin.

When used in an IMDG, a typical query seeks to access a set of objects matching specific properties. Just as a relational database queries a table by attributes, an IMDG queries a distributed, in-memory, object collection by matching class-based properties. In both cases, many results may match a query and be returned to the requesting client. When used sparingly, IMDG queries work quite well. However, when applications rely on query as the primary access model, access throughput can be seriously degraded, and overall application performance can suffer in two ways.

First, unlike key-based access, which is directed to a specific grid server to retrieve an object, a query requires the participation of all grid servers. This enables the IMDG to find all matching objects, which potentially reside on multiple servers within the distributed store. So a query incurs O(N) overhead on a cluster of N grid servers, while a key lookup incurs only O(1) overhead. Since an IMDG typically has many clients simultaneously making access requests, the combined overhead for many parallel queries can quickly grow. For this reason, query should be avoided when a key lookup will suffice.
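
The effect of this difference can be seen with a back-of-the-envelope model, sketched here in Python. The function names and the client and server counts are hypothetical, and the model assumes keys are spread evenly across servers and that each client issues one access.

```python
def server_requests(num_clients: int, num_servers: int, use_query: bool) -> int:
    """Total requests that grid servers must handle when every client
    issues one access.

    A key lookup goes to the single server that owns the key (O(1)),
    while a query must be sent to every server in the grid (O(N)).
    """
    per_access = num_servers if use_query else 1
    return num_clients * per_access

def load_per_server(num_clients: int, num_servers: int, use_query: bool) -> float:
    """Average requests handled by each server, assuming evenly spread keys."""
    return server_requests(num_clients, num_servers, use_query) / num_servers

for n in (10, 20):
    print(n, load_per_server(1000, n, use_query=False),
          load_per_server(1000, n, use_query=True))
```

In this model, doubling the grid from 10 to 20 servers halves the per-server load for key lookups (100 to 50 requests) but leaves it unchanged for queries (1,000 requests either way), because every query still visits every server.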

A bigger problem with query is that when it matches many results, a large amount of data may need to be returned over the network to the requesting client for processing, as illustrated below. This can quickly saturate the network (and bog down the client). When you consider that an IMDG can easily host terabytes of data distributed across several servers, it’s not surprising to see a single query return tens of megabytes (MB) of results. A gigabit network can move at most about 125MB/sec (and delays can start increasing at about half that), so large queries can (and often do) overload the network. And after a query returns the requested objects to the client for processing, the client must then wade through them all, potentially creating another bottleneck on the client.
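
The arithmetic behind this is easy to check. The short Python sketch below assumes a 1 Gbit/sec link, which is 125MB/sec of raw bandwidth in decimal units before protocol overhead. The 20MB result size and the query rates are hypothetical values chosen only for illustration.

```python
GIGABIT_BYTES_PER_SEC = 1_000_000_000 / 8  # 125 MB/sec raw, before protocol overhead

def transfer_seconds(result_mb: float) -> float:
    """Best-case time to move one query result across the link."""
    return result_mb * 1_000_000 / GIGABIT_BYTES_PER_SEC

def utilization(result_mb: float, queries_per_sec: float) -> float:
    """Fraction of the link consumed by query results (above 1.0 means overload)."""
    return result_mb * 1_000_000 * queries_per_sec / GIGABIT_BYTES_PER_SEC

print(round(transfer_seconds(20), 2))  # 0.16 seconds per 20MB result
print(round(utilization(20, 3), 2))    # 0.48 of the link at 3 queries/sec
print(round(utilization(20, 7), 2))    # 1.12, more than the link can carry
```

Under these assumptions, just three such queries per second already use about half the link, the point at which delays tend to start growing, and seven per second exceed its capacity.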

What’s interesting about this dilemma is that the IMDG’s apparent weakness is actually its key strength. It’s all in how the application looks at the problem to be solved. Instead of querying the grid, what if we just moved this work from the client into the grid and performed it there? This would enable the application to avoid bottlenecks and harness the IMDG’s scalable computing power to boost performance.

The technique is called data-parallel computing, and many IMDGs (like ScaleOut StateServer Pro®) provide APIs that make it easy to use in languages like Java, C#, and C++. In its simplest form, the application ships off a class-defined method (call it “Eval”) to execute in the grid along with a query specification, and the IMDG distributes the work across all of its servers, querying each server locally and then running the application’s method on the selected objects. The application can optionally define a second method (call it “Merge”) to combine the results and return them to the client. Running a method in the grid can be compared to executing a stored procedure in a database. The following diagram illustrates this concept:
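
The Eval/Merge pattern can also be sketched in a few lines of Python. This is only an illustration of the idea under simplifying assumptions, not the API of ScaleOut StateServer Pro or any other IMDG: the `invoke` function, its parameters, and the order data are hypothetical, and threads stand in for grid servers.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

_NO_RESULT = object()  # marks a server on which no object matched the query

def invoke(servers, selects, eval_fn, merge_fn):
    """Run eval_fn on each selected object where it lives, then merge.

    servers  : list of per-server object lists (stand-ins for grid servers)
    selects  : the query specification, as a predicate on one object
    eval_fn  : the "Eval" method, applied to each selected object
    merge_fn : the "Merge" method, combining two results into one
    Returns None if no object matched the query.
    """
    def run_on_server(store):
        # Local query followed by local evaluation; only one merged
        # value per server is handed back.
        results = [eval_fn(obj) for obj in store if selects(obj)]
        return reduce(merge_fn, results) if results else _NO_RESULT

    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        per_server = list(pool.map(run_on_server, servers))

    merged = [r for r in per_server if r is not _NO_RESULT]
    return reduce(merge_fn, merged) if merged else None

servers = [
    [{"region": "west", "amount": 40}, {"region": "east", "amount": 10}],
    [{"region": "west", "amount": 25}],
    [{"region": "east", "amount": 5}],
]

total_west = invoke(
    servers,
    selects=lambda order: order["region"] == "west",
    eval_fn=lambda order: order["amount"],
    merge_fn=lambda a, b: a + b,
)
print(total_west)  # 65
```

Note that each simulated server hands back a single merged value rather than its matching objects, which is what keeps network traffic small.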

Using data-parallel computing instead of parallel query gives an application two big wins. First, moving the code to the data dramatically reduces the amount of data transferred over the network, since the results of the computation are usually far smaller than the original query results. Second, the grid’s scalable computing power reduces execution time while avoiding a bottleneck in the client. This scales the application’s throughput as the size of the workload increases.

We have seen this computing model’s utility in countless applications. Consider, for example, a hedge fund storing portfolios of stocks in an IMDG as objects, where each portfolio tracks a given market sector (high tech, energy, healthcare, etc.). When the stock exchange’s ticker feed updates a stock price, the hedge fund needs to evaluate all corresponding portfolios to see if rebalancing is needed. The obvious way to implement this is to query the grid for all portfolios containing the updated stock and then analyze them in the client. However, this requires large amounts of data to cross the network and creates lots of work for the client. Instead, the client can simply kick off a data-parallel computation in the grid on all portfolios that contain the stock and let the grid perform this work quickly (and scalably). Using this technique, a hedge fund was able to see the time for rebalancing drop from several minutes to less than half a second.
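
The two approaches in the portfolio example can be compared side by side in a small Python sketch. Everything here is hypothetical and chosen for illustration: the portfolio layout, the ticker symbols, the rule in `needs_rebalance` (flag a portfolio when one stock exceeds half its value), and the use of JSON length as a rough proxy for bytes sent over the network. The servers are visited in a simple loop, whereas a real grid would run them in parallel.

```python
import json

# Hypothetical grid: each inner list holds the portfolios stored on one server.
grid = [
    [
        {"name": "tech", "shares": {"ACME": 100, "BETA": 50},
         "prices": {"ACME": 10.0, "BETA": 20.0}},
        {"name": "energy", "shares": {"OILCO": 200},
         "prices": {"OILCO": 40.0}},
    ],
    [
        {"name": "growth", "shares": {"ACME": 10, "GAMMA": 100},
         "prices": {"ACME": 10.0, "GAMMA": 50.0}},
    ],
]

def needs_rebalance(portfolio, symbol, new_price, limit=0.5):
    """Illustrative rule: flag if one stock exceeds `limit` of total value."""
    prices = {**portfolio["prices"], symbol: new_price}
    values = {s: qty * prices[s] for s, qty in portfolio["shares"].items()}
    return values[symbol] / sum(values.values()) > limit

def query_then_analyze(symbol, new_price):
    """Client-side approach: pull every matching portfolio over the network."""
    matches = [p for server in grid for p in server if symbol in p["shares"]]
    bytes_moved = len(json.dumps(matches))
    flagged = [p["name"] for p in matches
               if needs_rebalance(p, symbol, new_price)]
    return flagged, bytes_moved

def analyze_in_grid(symbol, new_price):
    """Data-parallel approach: each server returns only the names it flagged."""
    flagged, bytes_moved = [], 0
    for server in grid:
        local = [p["name"] for p in server
                 if symbol in p["shares"]
                 and needs_rebalance(p, symbol, new_price)]
        bytes_moved += len(json.dumps(local))
        flagged.extend(local)
    return flagged, bytes_moved

print(query_then_analyze("ACME", 30.0))
print(analyze_in_grid("ACME", 30.0))
```

Both functions flag the same portfolio, but the in-grid version moves only the short list of flagged names instead of every matching portfolio.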

Although it often masquerades as a distributed cache, an IMDG actually is a scalable, in-memory computing platform – not that different from a parallel supercomputer running on commodity hardware. With a small change in mindset, developers can easily harness its computing power to eliminate bottlenecks and reap big dividends in performance.

