parallel query Archives - ScaleOut Software
https://www.scaleoutsoftware.com/tag/parallel-query/

Simulate at Scale with Digital Twins
https://www.scaleoutsoftware.com/featured/simulate-at-scale-with-digital-twins/
Tue, 21 Feb 2023 14:00:39 +0000

Header image with four pictures: smart city, power grid, logistics, and gas card purchase.

 

Digital Twins Can Implement Both Streaming Analytics and Simulations

With the ScaleOut Digital Twin Streaming Service™, the digital twin software model has proven its versatility well beyond its roots in product lifecycle management (PLM). This cloud-based service uses digital twins to implement streaming analytics and add important contextual information not possible with other stream-processing architectures. Because each digital twin can hold key information about an individual data source, it can enrich the analysis of incoming telemetry and extract important, actionable insights without delay. Hosting digital twins on a scalable, in-memory computing platform enables the simultaneous tracking of thousands, or even millions, of data sources.

Owing to the digital twin’s object-oriented design, many diverse applications can take advantage of its powerful but easy-to-use software architecture. For example, telematics applications use digital twins to track telemetry from every vehicle in a fleet and immediately identify issues, such as lost or erratic drivers or emerging mechanical problems. Airlines can use digital twins to track the progress of passengers throughout an itinerary and respond to delays and cancellations with proactive remedies that smooth operations and reduce stress. Other applications abound, including health informatics, financial services, logistics, cybersecurity, IoT, smart cities, and crime prevention.

Here’s an example of a telematics application that tracks a large fleet of vehicles. Each vehicle has a corresponding digital twin analyzing telemetry from the vehicle in real time:

Image showing a fleet of vehicles in the USA. Each vehicle has a corresponding digital twin analyzing telemetry from the vehicle in real time.

Applications like these need to simultaneously track the dynamic behavior of numerous data sources, such as IoT devices, to identify issues (or opportunities) as quickly as possible and give systems managers the best possible situational awareness. To either validate streaming analytics code for a complex physical system or model its behavior, it is useful to simulate the devices and the telemetry that they generate. The ScaleOut Digital Twin Streaming Service now enables digital twins to simplify both tasks.

Use Digital Twins to Simulate a Workload for Streaming Analytics

Digital twins can implement a workload generator that produces telemetry for validating streaming analytics code. Each digital twin models the behavior of a physical data source, such as a vehicle in a fleet, and the messages it sends and receives. When running in simulation, thousands of digital twins can then generate realistic telemetry for all data sources and feed streaming analytics, such as a telematics application, designed to track and analyze their behavior. In fact, the streaming service enables digital twins to implement both the workload generator and the streaming analytics. Once the analytics code has been validated in this manner, developers can then deploy it to track a live system.

Here’s an example of using a digital twin to simulate the operations of a pump and the telemetry (such as the pump’s temperature and RPM) that it generates. Running in simulation, this simulated pump sends telemetry messages to a corresponding real-time digital twin that analyzes the telemetry to predict impending issues:

Once the simulation has validated the analytics, the real-time digital twin can be deployed to analyze telemetry from an actual pump:

Image of a data source sending messages to a real-time digital twin that analyzes the messages and enables data aggregation and visualization.

This example illustrates how digital twins can both simulate devices and provide streaming analytics for a live system.
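The pump example can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the streaming service's API: the class names (SimulatedPump, PumpAnalyticsTwin), the temperature threshold, and the heating_rate_c parameter are all assumptions made for the sketch. One object generates telemetry, and a second object holds per-pump state and analyzes each message as it arrives.

```python
import random
from dataclasses import dataclass, field


@dataclass
class PumpTelemetry:
    pump_id: str
    temperature_c: float
    rpm: int


@dataclass
class SimulatedPump:
    """Hypothetical workload generator: models one pump and emits telemetry."""
    pump_id: str
    temperature_c: float = 60.0
    rpm: int = 1500
    heating_rate_c: float = 0.0  # scenario parameter varied between runs
    rng: random.Random = field(default_factory=lambda: random.Random(0))

    def step(self) -> PumpTelemetry:
        # Drift the temperature by the scenario's heating rate plus small noise.
        self.temperature_c += self.heating_rate_c + self.rng.uniform(-0.2, 0.2)
        return PumpTelemetry(self.pump_id, self.temperature_c, self.rpm)


@dataclass
class PumpAnalyticsTwin:
    """Hypothetical real-time twin: keeps per-pump state across messages."""
    pump_id: str
    max_temperature_c: float = 80.0  # assumed alert threshold
    required_consecutive: int = 3    # readings needed before alerting
    consecutive_high: int = 0
    alerts: list = field(default_factory=list)

    def process_message(self, msg: PumpTelemetry) -> None:
        # Count consecutive over-threshold readings; reset on a normal one.
        if msg.temperature_c > self.max_temperature_c:
            self.consecutive_high += 1
        else:
            self.consecutive_high = 0
        # Alert once, at the moment the run of high readings is long enough.
        if self.consecutive_high == self.required_consecutive:
            self.alerts.append(f"{msg.pump_id}: sustained high temperature")


def run_scenario(heating_rate_c: float, steps: int = 50) -> list:
    pump = SimulatedPump("pump-1", heating_rate_c=heating_rate_c)
    twin = PumpAnalyticsTwin("pump-1")
    for _ in range(steps):
        twin.process_message(pump.step())
    return twin.alerts


healthy_alerts = run_scenario(heating_rate_c=0.0)
failing_alerts = run_scenario(heating_rate_c=1.0)
```

Because the analytics object only sees messages, the same process_message logic could in principle be pointed at telemetry from a real pump once the simulated scenarios behave as expected.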

Using digital twins to build a workload generator enables investigation of a wide range of scenarios that might be encountered in typical, real-world use. Developers can implement parameterizable, stateful models of physical data sources and then vary these parameters in simulation to evaluate the ability of streaming analytics to analyze and respond in various situations. For example, digital twins could simulate perimeter devices detecting security intrusions in a large infrastructure to help evaluate how well streaming analytics can identify and classify threats. In addition, the streaming service can capture and record live telemetry and later replay it in simulation.

Use Digital Twins to Simulate a Large System with Many Entities

In addition to using digital twins for analyzing telemetry, the ScaleOut Digital Twin Streaming Service enables digital twins to implement time-driven simulations that model large groups of interacting physical entities. Digital twins can model individual entities within a large system, such as airline passengers, aircraft, airport gates, and air traffic sectors in a comprehensive airline model. These digital twins maintain state information about the physical entities they represent, and they can run code at each time step in the simulation model’s execution to update their state over time. They can also exchange messages that model interactions.

For example, an airline can use simulation to model numerous types of weather delays and system outages (such as ground stops) to see how its tracking system manages passenger needs. As the simulation model evolves over time, simulated aircraft can model flight delays and send messages to simulated passengers that react by updating their itineraries. Here is a depiction of an airline tracking simulation:

Image of airplanes, passengers, and airports as a digital twin simulation for an airline.
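As a rough sketch of the airline scenario, the Python below models aircraft and passengers as stateful objects that run code at each time step and exchange messages. Every name here (Aircraft, Passenger, inbox, weather_delay_min) is hypothetical and chosen for illustration; the sketch assumes messages sent during one time step are handled by the recipient in the next one.

```python
from dataclasses import dataclass, field


@dataclass
class Passenger:
    """Hypothetical simulated passenger that tracks its expected delay."""
    name: str
    expected_delay_min: int = 0
    inbox: list = field(default_factory=list)

    def step(self, now_min: int) -> None:
        # React to delay messages delivered during the previous time step.
        for delay_min in self.inbox:
            self.expected_delay_min = delay_min
        self.inbox.clear()


@dataclass
class Aircraft:
    """Hypothetical simulated aircraft that accumulates weather delays."""
    flight: str
    passengers: list
    weather_delay_min: dict  # simulation time (minutes) -> added delay
    delay_min: int = 0

    def step(self, now_min: int) -> None:
        added = self.weather_delay_min.get(now_min, 0)
        if added:
            self.delay_min += added
            # Model the interaction: notify each passenger of the new delay.
            for passenger in self.passengers:
                passenger.inbox.append(self.delay_min)


alice = Passenger("alice")
bob = Passenger("bob")
flight = Aircraft("SO100", [alice, bob], weather_delay_min={10: 45, 20: 15})

# Passengers run before aircraft, so a message sent at step t is read at t+10.
twins = [alice, bob, flight]
for now_min in range(0, 40, 10):
    for twin in twins:
        twin.step(now_min)
```

After the run, both passengers hold the aircraft's cumulative 60-minute delay, which shows how state changes propagate through messages rather than through shared data.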

In contrast to the use of digital twins for PLM, which typically embodies a complex design within a single digital twin model, the ScaleOut Digital Twin Streaming Service enables large numbers of physical entities and their interactions to be simulated. By doing this, simulations can model intricate behaviors that evolve over time and reveal important insights during system design and optimization. They can also be fed live data and run faster than real time as a tool for making predictions that assist decision-making by managers (such as airline dispatchers).

Scalable, In-Memory Computing Makes It Possible

Digital twins offer a compelling software architecture for implementing time-driven simulations with thousands of entities. In a typical implementation, developers create multiple digital twin models to describe the state information and simulation code representing various physical entities, such as trucks, cargo, and warehouses in a telematics simulation. They create instances of these digital twin models (simply called digital twins) to implement all of the entities being simulated, and the streaming service runs their code at each time step being simulated. During each time step, digital twins can exchange messages that represent simulated interactions between physical entities.

The ScaleOut Digital Twin Streaming Service uses scalable, in-memory computing technology to provide the speed and memory capacity needed to run large simulations with many entities. It stores digital twins in memory and automatically distributes them across a cluster of servers that hosts a simulation. At each time step, each server runs the simulation code for a subset of the digital twins and determines the next time step that the simulation needs to run. The streaming service orchestrates the simulation’s progress on the cluster and advances simulation time at a rate selected by the user.

In this manner, the streaming service can harness as many servers as it needs to host a large simulation and run it with maximum throughput. As illustrated below, the service’s in-memory computing platform can add new servers while a simulation is running, and it can transparently handle server outages should they occur. Users need only focus on building digital twin models and deploying them to the streaming service.

Image of airplanes and airports that demonstrates how in-memory computing can simulate at scale.
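The time-stepping scheme described above, in which each server runs a subset of the digital twins and determines the next time step needed, can be approximated as follows. This is a single-process sketch under stated assumptions: the Truck class, the CRC32-based placement, and the rule that simulation time advances to the earliest time proposed by any server are illustrative choices, not a description of how the streaming service is implemented. A real cluster would run the per-server loops in parallel.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Truck:
    """Hypothetical simulated truck that reports at a fixed interval."""
    truck_id: str
    report_interval_s: int
    reports_sent: int = 0
    next_wake_s: int = 0

    def step(self, now_s: int) -> int:
        # Do work only when due, then say when this twin next needs to run.
        if now_s >= self.next_wake_s:
            self.reports_sent += 1
            self.next_wake_s = now_s + self.report_interval_s
        return self.next_wake_s


def partition(twins, server_count):
    # Assumed placement rule: hash each twin's id onto one of the servers.
    servers = [[] for _ in range(server_count)]
    for twin in twins:
        index = zlib.crc32(twin.truck_id.encode()) % server_count
        servers[index].append(twin)
    return servers


def run(twins, server_count, end_s):
    servers = partition(twins, server_count)
    now_s, steps = 0, 0
    while now_s <= end_s:
        # Each server steps its own twins and proposes the earliest next time.
        proposals = [
            min((twin.step(now_s) for twin in server), default=float("inf"))
            for server in servers
        ]
        # The orchestrator advances time to the earliest proposal overall.
        now_s = min(proposals)
        steps += 1
    return steps


truck_a = Truck("truck-a", report_interval_s=20)
truck_b = Truck("truck-b", report_interval_s=30)
steps_run = run([truck_a, truck_b], server_count=3, end_s=60)
```

Advancing to the earliest proposed time lets the sketch skip time steps in which no twin has work to do, and the result does not depend on which server a twin happens to land on.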

The Next Generation of Simulation with Digital Twins

Digital twins have historically been employed as a tool for simulating increasingly detailed behavior of a complex physical entity, like a jet engine. The ScaleOut Digital Twin Streaming Service takes digital twins in a new direction: simulation of large systems. Its highly scalable, in-memory computing architecture enables it to easily simulate many thousands of entities and their interactions. This provides a powerful new tool for extracting insights about complex systems that today’s managers must operate at peak efficiency. Its analytics and predictive capabilities promise to offer a high return on investment in many industries.

Data-Parallel Computing: Better than Parallel Query
https://www.scaleoutsoftware.com/featured/data-parallel-computing-better-than-parallel-query/
Mon, 20 May 2019 22:58:40 +0000

Although invoking a parallel query on an in-memory data grid (IMDG) is fast and provides scalable throughput, it can easily create a network bottleneck and then overload the client with work. A query that matches a large set of objects can send huge amounts of data back to the invoking client, and this can saturate the network. Once the client receives the query results, it must examine all of the objects to perform its desired work. Although query lookup within an IMDG’s cluster of servers takes advantage of scalable computing power, the network and the client have fixed resources, which create significant delays that extend the overall time required to complete all of the work.

Instead of using parallel query, consider running a data-parallel method within the IMDG to both query and perform the client’s work. ScaleOut StateServer® Pro offers an API for this purpose called “parallel method invocation” (and a variation of parallel foreach for .NET called “distributed foreach”). It lets a client application specify both a query expression and a Java or C# method to perform on the objects selected by the query, as well as a second method to combine the results and return a final, merged result back to the client. ScaleOut’s client libraries automatically ship the code and query spec to the IMDG, which runs everything in parallel on its cluster of servers before shipping back the final result. This ensures that all steps are performed in parallel for fast completion and scalable throughput. In addition, it eliminates network bottlenecks and reduces client CPU overhead.
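The pattern of a query, a per-object method, and a merge method can be sketched generically in Python. This is not the ScaleOut StateServer API (which the article describes for Java and C#); the partitions list stands in for the grid's servers, a thread pool stands in for the cluster, and the function names matches, evaluate, and merge are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce

# Hypothetical grid: three "servers", each holding a partition of orders.
partitions = [
    [{"region": "west", "total": 120.0}, {"region": "east", "total": 80.0}],
    [{"region": "west", "total": 45.5}],
    [{"region": "east", "total": 10.0}, {"region": "west", "total": 34.5}],
]


def matches(order) -> bool:
    # The query expression: select only orders from the west region.
    return order["region"] == "west"


def evaluate(order):
    # The per-object method: reduce each order to (count, revenue).
    return (1, order["total"])


def merge(a, b):
    # The combining method: pairwise merge of two partial results.
    return (a[0] + b[0], a[1] + b[1])


def run_on_server(partition):
    # Runs where the data lives: query, evaluate, and merge locally.
    results = [evaluate(order) for order in partition if matches(order)]
    return reduce(merge, results, (0, 0.0))


def invoke(all_partitions):
    # Each server works in parallel; only small partial results move.
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(run_on_server, all_partitions))
    return reduce(merge, partials, (0, 0.0))


count, revenue = invoke(partitions)
```

With a plain parallel query, all three matching order objects would travel to the client for processing. Here, only one small tuple per server crosses the network, and the client receives a single merged result.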

So the next time you need to query objects held in an IMDG, consider using data-parallel computing instead. It’s fast and easier to use than you might think. Learn more about data-parallel computing here
