VITA Technologies
Moving into embedded supercomputing

Ray Alderman, VITA — August 3, 2012

Massachusetts Institute of Technology (MIT) and other universities recently announced new algorithms for data-driven applications: advanced Fast Fourier Transforms (FFTs), swarm algorithms for Unmanned Aerial Vehicles (UAVs) and Unmanned Underwater Vehicles (UUVs), algorithms for extracting “fat tail” data in radar/sonar applications, and new beamforming algorithms for SIGnals INTelligence (SIGINT). These new algorithms need advanced supercomputing architectures such as VPX.

Start with a hypercube

At the May VITA Standards Organization (VSO) meetings, a proposal to add profiles for 4-dimensional and 6-dimensional hypercubes to the VITA 65 (OpenVPX) specification was accepted. When you start hooking together 8 or more CPUs, you must think about computer architectures in more than 3 dimensions. The simplest 4D architecture is the hypercube, or tesseract. Many of the new algorithm-driven applications could require more than 8 CPUs, so the fourth dimension is a good place to start.

In the early 1980s, David May and Robert Milne of Inmos developed a new microprocessor chip, the Transputer. They hooked 16 processors together, using its slow serial links, into a 4D hypercube architecture. The machine ran great, but the data links were far too slow, and each processor was data starved. Even with the multigigabit fabrics available today, processors in a hypercube can still be data starved, depending on the data-sharing patterns between the nodes.

In any n-dimensional architecture, the worst-case number of hops (how many nodes the data must pass through before it arrives at its destination) equals the number of dimensions of the architecture (n). In a 4D hypercube, the worst-case number of hops is 4 (Figure 1). To overcome this hop latency, you must place the applications that share the most data on the CPU nodes closest to each other (partitioning).
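The hop arithmetic above can be sketched in a few lines of Python. This is an illustration, not part of any VPX profile: label each node of an n-dimensional hypercube with an n-bit ID, link nodes whose IDs differ in exactly one bit, and the minimum hop count between two nodes is the Hamming distance of their IDs — never more than n.

```python
# Sketch: nodes of an n-dimensional hypercube labeled with n-bit IDs.
# Two nodes are directly linked when their IDs differ in exactly one bit,
# so the minimum hop count between any two nodes is the Hamming distance
# of their IDs -- at most n, the worst case described above.

def hops(a: int, b: int) -> int:
    """Minimum hops between hypercube nodes a and b (Hamming distance)."""
    return bin(a ^ b).count("1")

def neighbors(node: int, dims: int):
    """Directly linked nodes: flip each of the dims ID bits in turn."""
    return [node ^ (1 << d) for d in range(dims)]

DIMS = 4                        # 4D hypercube: 16 nodes
nodes = range(2 ** DIMS)
worst = max(hops(a, b) for a in nodes for b in nodes)
print(worst)                    # worst-case hop count equals the dimension: 4
print(len(neighbors(0, DIMS)))  # links per node also equals the dimension: 4
```

Placing heavy data-sharing applications on nodes with small Hamming distance between their IDs is exactly the partitioning step the article describes.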

Figure 1: 4D hypercube, 4 nodes with 4 processors each. The shortest path between modules is never more than four links. Image courtesy of VITA.

Hooking up 16 processors

The n-dimensional architectures minimize the number of links on each node. The number of full-duplex links required per node also equals the number of dimensions of the architecture (n). So, for a 4D hypercube (16 processors), each CPU board needs 4 bidirectional links. Compare that to the worst-case 2D architecture, a full mesh of 16 CPUs: each node needs a link to every other node, or 15 bidirectional links. Those 15 links burn far more power, consume huge numbers of connector pins, and take up board space that could be used for memory and other functions.

If you drop back to a 3D cube with 8 CPUs, the same rules apply: 3 bidirectional links per node, and a worst-case hop count of 3. Compare that to 8 CPUs in a full mesh: 7 bidirectional links per node. When you hook together more than 8 CPUs, you must go to n-dimensional architectures to minimize the board space for the link chips, reduce power consumption, and minimize the number of connector pins.
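The link-count comparison in the last two paragraphs reduces to two tiny formulas; a minimal sketch (function names are illustrative, not from any standard):

```python
# An n-dimensional hypercube needs n bidirectional links per node;
# a fully connected mesh of N CPUs needs N - 1 links per node.

def hypercube_links_per_node(dims: int) -> int:
    return dims                  # one link per dimension

def mesh_links_per_node(num_cpus: int) -> int:
    return num_cpus - 1          # a direct link to every other CPU

for dims in (3, 4):
    cpus = 2 ** dims
    print(f"{cpus} CPUs: hypercube {hypercube_links_per_node(dims)} "
          f"links/node vs. full mesh {mesh_links_per_node(cpus)} links/node")
# 8 CPUs: hypercube 3 links/node vs. full mesh 7 links/node
# 16 CPUs: hypercube 4 links/node vs. full mesh 15 links/node
```

The gap widens fast: links per node grow logarithmically with CPU count in a hypercube but linearly in a full mesh, which is why pins, power, and board space force the move to n-dimensional architectures past 8 CPUs.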

Protocol kills

You can build some effective low-latency supercomputing architectures using the Publish-Subscribe (P-S) model. In a P-S architecture, you can use the switches available today for fabrics such as InfiniBand, Ethernet, Serial RapidIO, and PCI Express.

The switches have a function called “broadcast,” in which any node can send data to the switch and the switch forwards that data to all the other nodes: the data is “published.” The other nodes can examine the packet header (“snooping”) and take the data; such a node is a “subscriber.” Using the broadcast function avoids the heavy protocol-stack overhead commonly found in the fabric chips. Many military applications might already be using the switch-chip broadcast function, implementing the P-S model in their new VPX systems.
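As a sketch of the pattern just described — not tied to any fabric switch API; the class and topic names here are invented for illustration — a broadcast-based P-S exchange looks like this:

```python
# Minimal in-process sketch of publish-subscribe over broadcast:
# a "switch" forwards every packet to all nodes, and each node snoops
# the packet header (the topic) to decide whether it is a subscriber.

class Node:
    def __init__(self, name, topics):
        self.name = name
        self.topics = set(topics)   # topics this node subscribes to
        self.received = []

    def deliver(self, topic, payload):
        # Snoop the header: keep the payload only if subscribed.
        if topic in self.topics:
            self.received.append(payload)

class Switch:
    def __init__(self):
        self.nodes = []

    def broadcast(self, topic, payload):
        # Publish: forward to every attached node; subscribers keep it.
        for node in self.nodes:
            node.deliver(topic, payload)

switch = Switch()
radar = Node("radar-proc", ["radar"])
sigint = Node("sigint-proc", ["sigint"])
switch.nodes += [radar, sigint]

switch.broadcast("radar", "pulse-data")   # only radar-proc keeps this
print(radar.received)    # ['pulse-data']
print(sigint.received)   # []
```

The key point the sketch illustrates is that the publisher never addresses individual subscribers — the switch fans the packet out, and filtering happens at the receiving nodes, which is what sidesteps the per-connection protocol-stack overhead.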

Nothing new under the sun

We have been doing a version of this P-S model, on a smaller scale, with VME boards. Several companies have sold “reflective memory” cards for years, which implement an elementary P-S model. These boards are used to build data recorders for military systems and other data-intensive applications that require low-latency connections. Rather than publishing the data to multiple subscribers, though, the reflective-memory links send the data from one board to another (point-to-point).

So, get up to speed on n-dimensional architectures, hypercubes, and P-S models. The algorithm jockeys are driving us to multiprocessor VPX-based supercomputing systems at a rapid pace.

For more information, contact Ray at [email protected].

© 2023 VITA Technologies. All rights Reserved.