- SPRINT distributes the attribute lists evenly over N processors of a
shared-nothing machine. Thus each processor works on only 1/N of the
total data.
- For parallel classification, the training-set examples are first
distributed equally among the N processors.
- Each processor generates its own attribute-list partitions in parallel
by projecting out each attribute from the training-set examples it was
assigned.
- The SPRINT algorithm requires continuous attribute lists to be sorted,
so a parallel sorting algorithm is used. Each processor first sorts its
continuous attribute lists locally; the lists are then repartitioned
across the processors into contiguous sorted sections.
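
The placement steps above can be sketched as a small single-process
simulation. This is only an illustration under stated assumptions, not
SPRINT's actual implementation: the N processors are modelled as plain
Python lists, each attribute-list entry is assumed to be a
(value, class label, record id) tuple, and the function names
(split_evenly, build_attribute_list, parallel_sort, place_data) are
hypothetical. The repartitioning step is stood in for by a merge followed
by an even split; a real shared-nothing system would exchange data between
processors instead.

```python
import heapq
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")
# Assumed entry layout: (attribute value, class label, record id).
Entry = Tuple[float, str, int]


def split_evenly(items: Sequence[T], n: int) -> List[List[T]]:
    """Cut items into n contiguous parts whose sizes differ by at most one."""
    base, extra = divmod(len(items), n)
    parts, start = [], 0
    for p in range(n):
        size = base + (1 if p < extra else 0)
        parts.append(list(items[start:start + size]))
        start += size
    return parts


def build_attribute_list(chunk: List[Tuple[int, Tuple[float, str]]]) -> List[Entry]:
    """Project one continuous attribute out of a processor's local examples."""
    return [(value, label, rid) for rid, (value, label) in chunk]


def parallel_sort(local_lists: List[List[Entry]]) -> List[List[Entry]]:
    """Sort each local list, then repartition into contiguous sorted sections."""
    # Step 1: every (simulated) processor sorts its own list.
    locally_sorted = [sorted(lst) for lst in local_lists]
    # Step 2: stand-in for the inter-processor exchange. Merging the sorted
    # runs and cutting the result evenly gives each processor one contiguous
    # range of the global sort order.
    merged = list(heapq.merge(*locally_sorted))
    return split_evenly(merged, len(local_lists))


def place_data(records: List[Tuple[float, str]], n: int) -> List[List[Entry]]:
    """Distribute examples over n processors and return sorted sections."""
    chunks = split_evenly(list(enumerate(records)), n)
    local_lists = [build_attribute_list(chunk) for chunk in chunks]
    return parallel_sort(local_lists)
```

Calling place_data(records, 3) on a list of (value, label) pairs returns
three lists, one per simulated processor. Concatenating them in processor
order yields the attribute list in globally sorted order, and each
processor holds roughly 1/N of the entries.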
DBMS
1999-03-11