Query Processing in a Symmetric Parallel Environment (Classic Reprint)

Excerpt from Query Processing in a Symmetric Parallel Environment

We consider a database machine consisting of n nodes connected by an O(n processing speed) bandwidth network. Each node consists of a processor, a random access memory, and a slower but much larger memory such as a disk. In order to approach optimal (O(n))speedup on this hardware architecture, we partition relations roughly evenly among the processors. We study the problem of optimizing multi-join queries assuming such a data distribution strategy. For us, optimization consists of minimizing the number of times we need to redistribute relations in the course of a query. The optimization problem is NP-complete for general queries but linear time for a subclass of tree queries.

1. Setting and Problem

The goal of the New York University Ultrabase project is to discover architectures and algorithms that speed up relational query processing roughly linearly with the number of processors. Fortunately for our choice of project title, we choose a symmetric parallel architecture (figure 1), generalizing the architecture of the New York University Ultracomputer [GGKARS83]. In this architecture, each processing node consists of a processor, a random access memory, and a slower but much larger memory. They are connected by a network whose bandwidth is proportional to the number of processors (That is, the network can sustain a communication load in which each processing node sends a message to an arbitrary processing node every c cycles, on the average, for some c unrelated to n.)

We justify our choice by an intuitive symmetry argument. To achieve an optimal speedup of, say, a join of R and S with the join clause R.A = S.B, each of the n processing nodes of our database machine should have about the same amount of work to do. This suggests that each processing node should contain approximately 1/n of the R and S tuples and should perform the join on those. In order for the result of the join to be correct, no tuple of R in processing node i should have the same A value as the B value of some S tuple in another processing node j. This implies that it may sometimes be necessary to send tuples from one processing node to another.

About the Publisher

Forgotten Books publishes hundreds of thousands of rare and classic books. Find more at www.forgottenbooks.com

This book is a reproduction of an important historical work. Forgotten Books uses state-of-the-art technology to digitally reconstruct the work, preserving the original format whilst repairing imperfections present in the aged copy. In rare cases, an imperfection in the original, such as a blemish or missing page, may be replicated in our edition. We do, however, repair the vast majority of imperfections successfully, any imperfections that remain are intentionally left to preserve the state of such historical works.

CHF 13.50

Lieferbar

ISBN	9781332092970
Sprache	eng
Cover	Kartonierter Einband (Kt)
Verlag	Forgotten Books
Jahr	2015

Kundenbewertungen

Dieser Artikel hat noch keine Bewertungen.