ADVANCES IN KNOWLEDGE DISCOVERY IN DISTRIBUTED DATABASES

Author(s): Valentin PUPEZESCU
Subject(s): Education
Published by: Carol I National Defence University Publishing House
Keywords: Knowledge Discovery in Distributed Databases; Data Mining; Distributed Databases; Knowledge Management; Neural Networks & Artificial Intelligence; Distributed-Committee Machines

Summary/Abstract: Knowledge Discovery in Distributed Databases is the process of extracting useful information from a collection of data stored in distributed databases. A distributed database is a collection of data replicated over a number of different computers. The structures best suited to working with distributed databases are Distributed Committee-Machines (DCMs). A Distributed Committee-Machine is a combination of neural networks that work as a group, in a distributed manner, to obtain better performance than individual neural networks in solving data mining tasks inside the KDD process. This paper studies the interaction between Distributed Committee-Machines and distributed databases. Replication across multiple machines can become very slow as the number of machines in the replication topology grows. This behavior is explained by the complex software that real implementations of replication use to make the same data available on multiple machines. I propose a design that overcomes those disadvantages, together with a new approach to storing the neural networks. The developed system stores the entire neural network in real relational databases. The optimized DCM structure eliminates the problems inherited from replication by writing all results locally, in special tables that are not replicated to the other distributed machines. The new storage approach consists of storing the entire neural network in a table as a BLOB (Binary Large Object). The method can also be beneficial in new types of eLearning techniques, such as the adaptive eLearning method that uses neural networks. With the optimized design of the DCM structures, the speedup in all the experiments is almost equal to the number of distributed machines that were used.
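
The idea of keeping a whole neural network in one table column as a BLOB can be illustrated with a minimal sketch. It is not the system described in the paper: the abstract does not name the database engine, the table layout, or the serialization format, so the use of SQLite, the `local_networks` table, the `save_network` and `load_network` helpers, and `pickle` serialization are all assumptions made for illustration. In a replicated setup, a table like this would simply be left out of the replication topology so that each machine writes its results locally.

```python
import pickle
import sqlite3


def save_network(conn, name, weights):
    """Store a whole network (here: nested lists of weights) as one BLOB.

    Hypothetical helper: the table name and schema are illustrative only.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS local_networks ("
        "name TEXT PRIMARY KEY, model BLOB NOT NULL)"
    )
    # Serialize the entire object to bytes; sqlite3 stores bytes as BLOB.
    blob = pickle.dumps(weights, protocol=pickle.HIGHEST_PROTOCOL)
    conn.execute(
        "INSERT OR REPLACE INTO local_networks (name, model) VALUES (?, ?)",
        (name, blob),
    )
    conn.commit()


def load_network(conn, name):
    """Read the BLOB back and rebuild the network object."""
    row = conn.execute(
        "SELECT model FROM local_networks WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        raise KeyError(name)
    return pickle.loads(row[0])


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # local, non-replicated store
    net = {"layers": [[[0.1, -0.2], [0.4, 0.3]], [[0.5], [-0.7]]]}
    save_network(conn, "member_1", net)
    print(load_network(conn, "member_1") == net)
```

Storing the network as a single opaque value keeps reads and writes to one row per committee member. Note that `pickle` should only be used to load data from a trusted source.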

  • Issue Year: 11/2015
  • Issue No: 01
  • Page Range: 311-319
  • Page Count: 9